TeXcount web service (version 3.2.0.41)

This page runs TeXcount to count the words in LaTeX documents. You can also run other/older versions of TeXcount, or download the script and run it on your own computer.

For help on options and adding macro handling rules, consult the documentation.

Help on options

Amount of parsing details in the output

The amount of parsing detail can be:
- None: no details on how the text was parsed
- Brief: only counted text shown
- Moderate: comments excluded
- Verbose: all code shown
Colour codes are given to indicate how the different parts have been interpreted.

Give counts per section

In addition to the total count, subcounts are produced, with break points set at e.g. section headers.
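As a rough illustration of what a break point means, the following Python sketch splits some LaTeX code at \section headers and counts whitespace-separated words in each part. It is not how TeXcount itself parses documents: the function name counts_per_section and the regular expression are assumptions made for this example, macros are not interpreted, and the words in the headers themselves are not counted.

```python
import re

# Assumption: only \section and \section* act as break points in this sketch.
SECTION = re.compile(r"\\section\*?\{([^{}]*)\}")

def counts_per_section(latex):
    """Return (title, word count) pairs, one per part of the document."""
    # With one capture group, split() yields [text, title, text, title, text, ...]
    parts = SECTION.split(latex)
    result = [("(before first section)", len(parts[0].split()))]
    for title, body in zip(parts[1::2], parts[2::2]):
        result.append((title, len(body.split())))
    return result

sample = (
    "Opening words here.\n"
    "\\section{Intro}\nOne two three.\n"
    "\\section*{Method}\nFour five.\n"
)
for title, n in counts_per_section(sample):
    print(title, n)
```

The total count is then simply the sum of the subcounts.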

Give cumulative counts at the end of each line

Show a total sum. Also show the cumulative sum at the end of each line. The sum may count words in text only, all words (text+headers+captions), or words plus the number of formulae.
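The three kinds of sum can be pictured with a small Python sketch. The per-line tallies and the category names (text, headers, captions, formulae) are hypothetical and chosen only for this example; TeXcount's internal bookkeeping may differ.

```python
from itertools import accumulate

# Hypothetical mapping from sum type to the categories it adds up.
SUM_MODES = {
    "text": ("text",),
    "all_words": ("text", "headers", "captions"),
    "words_and_formulae": ("text", "headers", "captions", "formulae"),
}

def cumulative_sums(line_counts, mode="all_words"):
    """Return the running total after each line for the chosen sum type."""
    keys = SUM_MODES[mode]
    per_line = [sum(counts.get(k, 0) for k in keys) for counts in line_counts]
    return list(accumulate(per_line))

# One dictionary per line of the document (made-up numbers).
lines = [
    {"headers": 2},
    {"text": 7},
    {"text": 4, "formulae": 1},
    {"captions": 3},
]
print(cumulative_sums(lines, "text"))                # [0, 7, 11, 11]
print(cumulative_sums(lines, "all_words"))           # [2, 9, 13, 16]
print(cumulative_sums(lines, "words_and_formulae"))  # [2, 9, 14, 17]
```

The last value of each list is the total sum for that sum type.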

Strictness of rules to identify words, options, etc.

In the relaxed mode, the criteria for what is permitted inside words and options are relaxed. The restricted mode, on the other hand, may be used e.g. if TeXcount tends to join words together.

Ignore or include bibliography in word count

By default, the bibliography is not included in the word count. However, if the bibliography environment, i.e. thebibliography, is present in the LaTeX document and the bibliography is set to be included, it will be counted.

Beware, however, that the bibliography items may be formatted in a way that TeXcount does not interpret as desired, and that all numbers will by default be counted as words. It is recommended that the relaxed mode be used to help avoid some formatting problems such as the use of {U}ppercase format to indicate capital letters.

Note that bibliography inclusion may also be set by including the TC-command %TC:bibinc in the LaTeX document.

Language type and what is counted as words

The default is to count all words: TeXcount then tries to count all words in all languages, including 'logographic characters', each of which is counted as one word (e.g. Chinese). To restrict the count to alphabetic languages, i.e. languages where words are made up of sequences of letters and separated by spaces, select the alphabetic option. You can also choose to count letters instead of words.
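The difference between these choices can be sketched in Python. This is only an approximation for illustration: it treats the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) as the logographic characters, whereas TeXcount's own script classes are broader, and the function name count_words and the mode names are assumptions made for this example.

```python
import re

# Assumption: this single Unicode block stands in for 'logographic characters'.
LOGOGRAM = r"[\u4e00-\u9fff]"
# A run of Unicode letters that are not logograms; [^\W\d_] matches any letter.
ALPHA_WORD = r"(?:(?!" + LOGOGRAM + r")[^\W\d_])+"

def count_words(text, mode="all"):
    """Count words ('all' or 'alphabetic') or individual letters ('letters')."""
    if mode == "letters":
        return len(re.findall(r"[^\W\d_]", text))
    words = len(re.findall(ALPHA_WORD, text))
    if mode == "alphabetic":
        return words
    # Default: every logographic character counts as one word.
    return words + len(re.findall(LOGOGRAM, text))

sample = "Hello 世界 world"
print(count_words(sample))                # 4
print(count_words(sample, "alphabetic"))  # 2
print(count_words(sample, "letters"))     # 12
```

Note that this sketch splits words at apostrophes and hyphens, which is stricter than what a real word counter would normally do.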

There are Chinese and Japanese options, which from version 2.3 are no longer required to count Chinese/Japanese characters since these are included in the default option. However, selecting them allows TeXcount to guess at some other file encodings (e.g. GB2312 and Big5 for Chinese), and also restricts the languages and scripts included in the count. Similarly, there are options for Korean (which counts Korean characters) and Korean words (which counts hangul words as words separated by spaces). The alternatives Chinese only etc. try to restrict what is counted, e.g. ignoring other characters and words, but may not be as restrictive as desired.

More detailed control over which languages and scripts are counted is available in the TeXcount script, but not on this web service.

I recommend using UTF-8 for all non-ASCII text. For access to a wider range of encodings, you need to download and run the TeXcount script.

Count word frequencies

Each word is counted (independent of case) and a summary of the words is produced in descending order of frequency. All words may be listed, or only those occurring a sufficient number of times. Although the word count is based on lower case only, TeXcount will retain letters that are consistently upper case.
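A minimal Python sketch of this kind of frequency count is given below. It is an illustration of the idea and not TeXcount's own code: the function names word_frequencies and restore_case and the min_count parameter are made up for the example, and words are simply taken to be runs of letters.

```python
import re
from collections import Counter, defaultdict

def restore_case(key, variants):
    """Keep a letter upper case only if every observed spelling has it so."""
    variants = sorted(variants)
    if len({len(v) for v in variants}) != 1:
        return key  # rare Unicode case mappings: fall back to lower case
    return "".join(
        column[0] if all(c.isupper() for c in column) else column[0].lower()
        for column in zip(*variants)
    )

def word_frequencies(text, min_count=1):
    """Return (word, count) pairs in descending order of count."""
    counts = Counter()
    spellings = defaultdict(set)
    for word in re.findall(r"[^\W\d_]+", text):
        key = word.lower()          # counting is independent of case
        counts[key] += 1
        spellings[key].add(word)
    rows = [
        (restore_case(key, spellings[key]), n)
        for key, n in counts.items()
        if n >= min_count
    ]
    rows.sort(key=lambda row: (-row[1], row[0].lower()))
    return rows

sample = "LaTeX is fun. latex is LaTeX. NASA uses LaTeX; NASA is big."
print(word_frequencies(sample, min_count=2))
# [('latex', 4), ('is', 3), ('NASA', 2)]
```

Here NASA keeps its capitals because it is always written that way, while LaTeX is reported in lower case because it also occurs as latex.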

Select LaTeX file to be processed

You may provide a LaTeX file to be uploaded to this site and processed with TeXcount: this requires that no LaTeX code is provided in the text area, otherwise the LaTeX code text area will be used instead of the file. The file does not have to be a complete LaTeX document: an included part of a LaTeX document will do as long as all groups are balanced.

Note that while TeXcount can handle included files, this function is not available on this web service. Hence, only the LaTeX code within the file will be analysed.

File encoding (how non-ASCII characters are represented)

If the document only contains ASCII characters, the choice of file encoding makes no difference. However, letters other than the unaccented Latin letters A-Z are not ASCII, and how they are represented in the file depends on the encoding. For most purposes, UTF-8 encoding (a Unicode encoding compatible with ASCII) is recommended. If auto is selected, TeXcount will guess which encoding seems most appropriate.
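The following Python sketch shows why the encoding only matters for non-ASCII text, together with one simple way of guessing an encoding. The guessing strategy (try strict decoding with each candidate in turn) and the function name guess_encoding are assumptions for illustration; TeXcount's own guessing may work differently.

```python
def guess_encoding(data, candidates=("ascii", "utf-8", "latin-1")):
    """Return the first candidate encoding that decodes the bytes without error."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None

# Pure ASCII text is stored as the same bytes in all three encodings.
assert "word count".encode("ascii") == "word count".encode("utf-8")
assert "word count".encode("ascii") == "word count".encode("latin-1")

# Non-ASCII letters are stored differently depending on the encoding.
print("café".encode("utf-8"))    # b'caf\xc3\xa9'
print("café".encode("latin-1"))  # b'caf\xe9'

print(guess_encoding("word count".encode("utf-8")))  # ascii
print(guess_encoding("café".encode("utf-8")))        # utf-8
print(guess_encoding("café".encode("latin-1")))      # latin-1
```

Since Latin-1 can decode any sequence of bytes, it is placed last in the candidate list; a guess of this kind can therefore be wrong, which is one reason to prefer UTF-8 and state the encoding explicitly.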

Allow logging of which macros, environments and packages are used

Allow TeXcount to log which macros, environments and packages are being used, and how many times. This is to help improve TeXcount by providing information on which macros and packages are being used and to detect macros and packages for which support could be improved.

By default, this is turned on. A list of all macros, environments and packages will then be stored in a log file together with how many times they were called. None of the text will be stored.

If you would rather that this information not be stored, you can uncheck the box.

LaTeX code

You may enter the LaTeX code to be analysed directly into the LaTeX code text area. This will be processed in the same way as an uploaded LaTeX file: i.e. it does not have to be a complete LaTeX document, but may be only a part of it, e.g. a file which is included in a LaTeX document, as long as all groups are balanced.
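If you are unsure whether a fragment is balanced, a quick check along the following lines may help before submitting it. This Python sketch only looks at curly braces, skips escaped characters such as \{ and \%, and ignores comments; it does not check \begin and \end pairs or verbatim material, and the function name groups_balanced is made up for this example.

```python
def groups_balanced(latex):
    """Return True if every { has a matching } in the given LaTeX code."""
    depth = 0
    for line in latex.splitlines():
        i = 0
        while i < len(line):
            ch = line[i]
            if ch == "\\":
                i += 2      # skip the backslash and the character it escapes
                continue
            if ch == "%":
                break       # the rest of the line is a comment
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth < 0:
                    return False    # a group was closed before it was opened
            i += 1
    return depth == 0

print(groups_balanced(r"\section{Intro} Some \textbf{bold} text."))  # True
print(groups_balanced(r"\textbf{unclosed"))                          # False
```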