Strip non-ASCII characters from files
Last update: 2021-04-30
«damned copy&paste from Microsoft Word»
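Create a toascii.sh file:

    #!/bin/bash
    # Strip every byte outside the ASCII range (octal 0-177) from the given files, in place.
    if [ -z "$1" ]; then
      echo "USAGE: toascii.sh <file1 file2 ...|*.ext>"
      exit 1
    fi

    for i in "$@"; do
      # LC_ALL=C makes tr work on raw bytes; -d deletes, -c complements the set,
      # so everything that is NOT in \0-\177 is removed.
      # Quoting "$i" keeps file names with spaces intact.
      LC_ALL=C tr -dc '\0-\177' < "$i" > "$i.tmp" && mv "$i.tmp" "$i"
    done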
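Since toascii.sh rewrites files in place, it can help to first see how many bytes would be dropped. The following is only a minimal sketch built on the same tr range; `count_non_ascii` is a hypothetical helper name, not part of the original script.

```shell
#!/bin/bash
# count_non_ascii: hypothetical helper, not part of toascii.sh.
# Prints the number of bytes outside the ASCII range (octal 000-177) in a file.
count_non_ascii() {
  local n
  # Delete all ASCII bytes and count whatever is left.
  n=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c)
  # Arithmetic expansion strips the padding some wc versions add.
  echo "$((n))"
}
```

A result of 0 means the file is already pure ASCII and toascii.sh would leave its content unchanged.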
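or use sed (GNU sed, which understands the \x.. escapes):

    # Replace each 3-byte UTF-8 sequence with X and each 2-byte sequence with Y.
    # Reads stdin, writes stdout. 2-byte lead bytes run up to \xdf, so the
    # second range covers \xc0-\xdf.
    LC_ALL=C sed 's/[\xe0-\xef]../X/g; s/[\xc0-\xdf]./Y/g'
    # input:  "#$%&0@ABab����ほぼぽま~अ
    # output: "#$%&0@ABabYYYYXXXX~X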
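The sed command above marks multi-byte sequences with X and Y rather than removing them. If plain deletion is wanted, as in toascii.sh, a sketch along these lines may do; it assumes GNU sed (the `\x..` escapes are a GNU extension), and `strip_non_ascii` is a hypothetical function name.

```shell
#!/bin/bash
# strip_non_ascii: hypothetical filter, reads stdin and writes stdout.
# Assumes GNU sed; LC_ALL=C makes the bracket range match raw bytes,
# so every byte from 0x80 to 0xff is deleted one at a time.
strip_non_ascii() {
  LC_ALL=C sed 's/[\x80-\xff]//g'
}
```

It could be used as `strip_non_ascii < notes.txt > notes.ascii.txt`, where both file names are placeholders.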
Source: Stack Exchange, unix.com