Strip non-ASCII characters from files
mouse 333 · person cloud · link
Last update: 2021-04-30
«damned copy&paste from Microsoft Word»

Create a toascii.sh file:

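#!/bin/bash
# Delete every byte outside the 7-bit ASCII range (octal 000-177) from each file, in place.

if [ "$#" -eq 0 ]; then
  echo "USAGE: toascii.sh <file1 file2 ...|*.ext>" >&2
  exit 1
fi

for i in "$@"; do
  # LC_ALL=C makes tr work on single bytes; quoting keeps names with spaces intact.
  # The original is overwritten only if tr succeeded.
  LC_ALL=C tr -dc '\0-\177' < "$i" > "$i.tmp" && mv -- "$i.tmp" "$i"
done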
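
Because the script overwrites files in place, it can help to check first which files contain non-ASCII bytes at all. A minimal sketch, assuming bash and the same tr filter as above; the helper name has_non_ascii and the demo file are hypothetical, not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the file contains at least one non-ASCII byte.
has_non_ascii() {
  # Delete all ASCII bytes; whatever is left is non-ASCII, and wc -c counts it.
  [ "$(LC_ALL=C tr -d '\0-\177' < "$1" | wc -c)" -gt 0 ]
}

# Demo on a throwaway file containing an e-acute and two curly quotes (UTF-8, written as octal).
demo=$(mktemp)
printf 'caf\303\251 \342\200\234quoted\342\200\235\n' > "$demo"

has_non_ascii "$demo" && echo "non-ASCII found"

# The same filter toascii.sh applies, written to stdout instead of back to the file.
LC_ALL=C tr -dc '\0-\177' < "$demo"   # prints: caf quoted

rm -f "$demo"
```

Note that the characters are dropped, not transliterated: "café" becomes "caf", and the curly quotes vanish without being replaced by straight ones.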

Or use sed to replace each multibyte UTF-8 character with a placeholder instead of deleting it:

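# X replaces each 3-byte UTF-8 sequence, Y each 2-byte one (lead bytes 0xC0-0xDF).
# The \xHH escapes need GNU sed.
LC_ALL=C sed 's/[\xe0-\xef]../X/g; s/[\xc0-\xdf]./Y/g'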
# input:  "#$%&0@ABab����ほぼぽま~अ
# output: "#$%&0@ABabYYYYXXXX~X
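
The sed command leaves four-byte sequences (emoji, for example) untouched, since their lead bytes fall outside both ranges. A sketch of a variant that handles every UTF-8 lead byte, assuming GNU sed for the \xHH escapes; the function name mark_multibyte and the Z placeholder are hypothetical:

```shell
#!/bin/bash
# Hypothetical filter: replace each multibyte UTF-8 sequence with one placeholder letter.
#   Z = 4-byte sequence (lead byte 0xF0-0xF7)
#   X = 3-byte sequence (lead byte 0xE0-0xEF)
#   Y = 2-byte sequence (lead byte 0xC0-0xDF)
# LC_ALL=C makes "." match exactly one byte. Requires GNU sed.
mark_multibyte() {
  LC_ALL=C sed 's/[\xf0-\xf7].../Z/g; s/[\xe0-\xef]../X/g; s/[\xc0-\xdf]./Y/g'
}

# Input bytes in octal: e-acute (2 bytes), hiragana "ho" (3 bytes), an emoji (4 bytes).
printf 'a\303\251b\343\201\273c\360\237\230\200d\n' | mark_multibyte   # prints: aYbXcZd
```

Input that is not valid UTF-8 (a lead byte followed by ASCII, say) would have the following bytes swallowed along with it, so this is only suitable for text known to be UTF-8.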

Source: Stack Exchange, unix.com