fix some formatting

Uwe Steinmann 2025-10-23 13:15:53 +02:00
parent 5a25b7cd3a
commit 7e2803da25


@@ -29,37 +29,35 @@ php-fpm's configuration. On Debian this is done in the file
 Conversion to text for fulltext search
 =======================================
-text/plain
-text/csv
-application/csv
-cat '%s'
+* text/plain, text/csv, application/csv
+`cat '%s'`
-application/pdf
-pdftotext -q -nopgbrk %s - | sed -e 's/ [a-zA-Z0-9.]\{1\} / /g' -e 's/[0-9.]//g'
+* application/pdf
+`pdftotext -q -nopgbrk %s - | sed -e 's/ [a-zA-Z0-9.]\{1\} / /g' -e 's/[0-9.]//g'`
 If pdftotext takes too long on large document you may want to pass parameter
--l to specify the last page to be converted. -q is for suppressing error/warnings
+`-l` to specify the last page to be converted. `-q` is for suppressing error/warnings
 send to stderr
-mutool draw -F txt -q -N -o - %s
+`mutool draw -F txt -q -N -o - %s `
-application/vnd.openxmlformats-officedocument.wordprocessingml.document
-docx2txt '%s' -
+* application/vnd.openxmlformats-officedocument.wordprocessingml.document
+`docx2txt '%s' -`
-application/msword
-catdoc %s
+* application/msword
+`catdoc %s`
-application/vnd.oasis.opendocument.text
-odt2txt %s
+* application/vnd.oasis.opendocument.text
+`odt2txt %s`
-application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
-xlsx2csv -d tab %s
+* application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
+`xlsx2csv -d tab %s`
-application/vnd.ms-excel
-xls2csv -d tab %s
+* application/vnd.ms-excel
+`xls2csv -d tab %s`
-text/html
-html2text %s
+* text/html
+`html2text %s`
 Many office formats
 unoconv -d document -f txt --stdout '%s'
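
The `%s` in the commands above presumably stands for the path of the file to convert. To try a converter by hand before entering it in the configuration, a small wrapper like the following can help. It is only a sketch: `run_converter` is a hypothetical helper name, and how the application itself fills the placeholder is not shown in this diff.

```shell
#!/bin/sh
# Hypothetical helper, not part of the documented setup: fills the %s
# placeholder of a converter template with a file path and runs the result.
# Assumption: the path contains no quotes, '|', '&' or backslashes, since it
# is pasted into a sed replacement and then into a shell command line.
run_converter() {
    template=$1
    file=$2
    # Replace every %s in the template with the file path.
    cmd=$(printf '%s\n' "$template" | sed "s|%s|$file|g")
    # Let the shell run the finished command; its stdout is the extracted text.
    sh -c "$cmd"
}
```

For example, `run_converter "cat '%s'" notes.txt` prints the file unchanged, and `run_converter "pdftotext -q -nopgbrk -l 20 %s -" report.pdf` would stop converting after page 20. The file names and the page limit are made-up values.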