News

Release of version 3.2

Wed, 05 Aug 2020 00:19:43 +0200

Some minor fixes from the beta version, and some more macro and package rules added, some based on previously received suggestions. My apologies for any submitted suggestions I had, or have, overlooked.

Also changed to include amsmath rules by default.

Release of version 3.2.beta for testing

Wed, 22 Jul 2020 00:38:41 +0200

Rules added for some more packages, and added amsmath rules as part of default rules.

Added the ability to count numbers separately from other words using the %TC:wordtype specification: still experimental and subject to change. Better parsing of various number formats.
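To illustrate what counting numbers separately from words means, here is a minimal Python sketch. It is not TeXcount's Perl code and does not use the %TC:wordtype syntax; the patterns NUMBER and WORD and the function count_by_type are hypothetical names chosen for this example, and the number formats accepted (digits with optional "." or "," groups) are an assumption.

```python
import re

# Assumed number format: digits, optionally followed by "." or "," groups (e.g. 1,250.5).
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")
# Assumed word format: Latin letters, with optional inner apostrophes or hyphens.
WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")
# A token is either a number or a word; numbers are tried first.
TOKEN = re.compile(NUMBER.pattern + "|" + WORD.pattern)


def count_by_type(text):
    """Return separate counts for word tokens and number tokens."""
    counts = {"words": 0, "numbers": 0}
    for token in TOKEN.findall(text):
        if NUMBER.fullmatch(token):
            counts["numbers"] += 1
        else:
            counts["words"] += 1
    return counts


if __name__ == "__main__":
    print(count_by_type("In 2020 we sold 1,250.5 units"))
```

In this sketch a trailing comma or full stop after a number is not treated as part of the number, because a separator only counts when more digits follow it.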

Custom summary output now allows arithmetic combinations of counters.

HTML layout improved, with colour codes moved to end by default.

Several minor issues fixed.

Rerelease of version 3.1.1

Sat, 03 Nov 2018 07:10:07 +0100

By mistake, the package TeXcount_3_1_1.zip released here on 28 October contained the old version 3.1 instead of version 3.1.1. This has now been corrected.

Release of version 3.1.1

Sun, 28 Oct 2018 10:19:00 +0100

Fixed regular expressions containing unescaped curly-braces: TeXcount fails on recent Perl versions (bug reported for Perl 5.28, but may have started with Perl 5.26) due to this.

The search path for file inclusion has been improved and should now work more like what TeX/LaTeX actually does.

Release of version 3.1

Sat, 16 Sep 2017 14:56:18 +0200

Fixed error in help command, and error when detecting terminal width. Added option -all-nonspace-characters to count characters including punctuation.

Release of version 3.0.1 for testing

Sun, 02 Apr 2017 01:44:04 +0200

A few bugs have been fixed, mostly special cases where things could sometimes go wrong. File name handling has been modified a bit, and should hopefully be more robust now.

Previously, package inclusion macros (e.g. \usepackage) were not parsed by the same routines as the rest of the document, and were therefore less robustly handled. This has now been changed.

The help texts have been restructured so that it is easier to find help on specific topics, and they now point to the documentation more clearly.

Verbose output to the terminal will now try to determine the terminal width rather than just use a default width.

Updated TeXcount version (3.0.0.24)

Fri, 11 Apr 2014 13:38:13 +0200

A bug in the interpretation of special letters, e.g. \mu, caused macros like \multi to be mistaken for the word {\mu}lti. This is now fixed.
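The kind of mistake described here can be sketched in a few lines of Python. This is only an illustration of the general pitfall, not TeXcount's actual Perl code: the list SPECIAL_LETTERS and the two patterns are hypothetical. A pattern that matches \mu without checking what follows will also match the first three characters of \multi; requiring that no letter follows the macro name avoids this.

```python
import re

# Hypothetical list of "special letter" macros; not TeXcount's real table.
SPECIAL_LETTERS = ["mu", "alpha", "beta"]
_alternatives = "|".join(SPECIAL_LETTERS)

# Naive pattern: matches \mu even when it is only the start of a longer macro name.
NAIVE = re.compile(r"\\(?:" + _alternatives + r")")

# Guarded pattern: the macro name must not be followed by another letter.
GUARDED = re.compile(r"\\(?:" + _alternatives + r")(?![A-Za-z])")


def find_special_letters(pattern, text):
    """Return every substring of text that the given pattern treats as a special letter."""
    return pattern.findall(text)


if __name__ == "__main__":
    sample = r"\mu and \multi{2}"
    print(find_special_letters(NAIVE, sample))    # finds \mu twice, once inside \multi
    print(find_special_letters(GUARDED, sample))  # finds only the real \mu
```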

Also, the option -out-stderr has been added which will channel all output to STDERR. This may be useful when executing TeXcount from some editors which return the STDERR output but not the standard output from TeXcount.

Release of TeXcount version 3.0

Mon, 29 Jul 2013 17:15:01 +0200

Finally, version 3.0 of TeXcount is out!

One of the major improvements is in the handling of macro options. Previously, these were ignored, while now rules can be added to count these: e.g. for \item[...], the option is now counted as text.

Another improvement is that the addition of new macro rules no longer uses the old numerical codes, but the rules have been given names. It is also possible to add new counters to count other categories of text.

The verbose output, and the ability to customise this, has been improved. Windows support for colour coded output from the command line has been added, avoiding the need to go via HTML and a browser; I still recommend the HTML output, however, since this is more readable and more customisable.

And finally, there have been more macro rules added, in particular support for more packages, as well as the customary bug fixes where discovered.

Update of version 3.0.beta (build 74) available for testing

Tue, 04 Jun 2013 14:38:41 +0800

The version 3.0 release is now much overdue and should come shortly, but to make sure known problems have been fixed and no new ones introduced, here's the latest build of the 3.0β. Mostly, it has some minor fixes. One of the bigger changes is that the default macro option handling rule has been relaxed: it may have been a bit too strict in the past, making TeXcount fail to recognise some options as options and instead process them as text.

Writing the summary output as TeX code has been improved by the introduction of the -tex option. Previously, error messages containing TeX special characters could cause problems when included in TeX code.

A new TeXcount instruction, %TC:insert, has been added for inserting TeX code for TeXcount to process, but which will still be considered a comment by TeX.

Version 3.0.beta ready for testing

Wed, 30 May 2012 19:46:39 +0200

The most important change from 3.0α is the addition of several new macro rules, including a number of package specific rules. This adds to the improvements in 3.0α of being able to parse and count optional parameters like \item[...].

The 3.0α version had support added for counting the macros, and users of the web-service trying out 3.0α could (voluntarily) allow the list of macros and packages (no text!) to be logged. This has helped me identify macros and packages that are commonly used and for which parsing rules should be added.

Some bugs introduced in 3.0α have been fixed.

Version 3.0.alpha ready for testing

Mon, 09 Jan 2012 16:34:53 +0100

Version 3.0 of TeXcount will have a number of new features and improvements. The α-version is now available for testing. Most likely, there may be bugs left that I have not detected. I hope to catch most errors before the β-release, so please inform me of any errors or unexpected behaviours.

One of the most obvious improvements is that optional macro parameters, i.e. those of the form [...], can now be processed by TeXcount: previously, they were all ignored by default.

There are some other improvements in the handling of macro rules, and as part of this headers no longer need separate sets of rules. Macro rules are no longer specified through cryptic numerical codes, but by keywords; the same applies to counters. It is also possible now to add more counters and so e.g. add rules to count footnotes separately rather than together with captions.

The verbose output can now be customised through the -v= option: the user can specify in great detail which elements are included in the verbose output. For HTML output, the user can also provide an HTML template using -htmlfile=, or CSS style through -css= or -cssfile=.

In Windows, colour coded output should now work, although not perfectly: the background colour tends to get changed. However, it is better than not working at all.

Macro rules for file inclusion commands are now more flexible, and support for the import package has been added. There are also new options -out= to print output directly to file, and -auxdir= to specify the directory where auxiliary output (i.e. the bbl-files) is stored.

TeXcount can now also collect macro usage statistics through the -macrostat option. This was added in part with the web-service in mind: if the user permits it (logging can be unchecked to disallow), TeXcount will log the number of times macros, environments and packages were used. Through this, I hope to capture packages and macros for which I should add parsing rules.

For improved help, the options -help-option and -help-style are provided.

Details are provided in the documentation. In addition to the TeXcount user manual, and the briefer Quick Reference manual, there is now also a Technical Documentation that explains the Perl code of TeXcount in greater detail.

Quick bug fix for version 2.3

Sat, 30 Jul 2011 13:18:39 +0800

I have found and fixed an error related to the handling of East Asian languages which might cause problems under some Perl implementations by making TeXcount process e.g. Chinese characters as letters rather than logographic characters (i.e. each character counted as a word).

Version 2.3 released

Thu, 28 Jul 2011 23:29:02 +0800

Version 2.3 is finally released after lying about, first as an α version and then as a β, for a long time.

Compared to the 2.3β version, there are some minor fixes. The default for Korean is now to count the characters, not space-separated words as before, although that is still possible by specifying -korean-words.

Compared to the last major release, however, there are some major changes. The first thing you may notice is that I have changed the HTML output format quite a bit. In the previous versions, the verbose output relied to a large extent on inserting spaces between words, which I have now done away with by boxing in the words.

Language support has been vastly improved, including the flexibility to add more languages. This relies heavily on Unicode, however, which is the recommended encoding for anyone working with non-ASCII character sets.

File processing has also been improved. TeXcount can now be used in a pipe. When working with a main document that includes several subdocuments, it is now possible to merge these included documents into the main document rather than counting them as separate files. It can now also include the bibliography, either by including it in the document or by including the bibliography file.

TeXcount has also been made more customisable in that the user can provide a template for the summary output, and options (including this template) can be provided to TeXcount in a separate options file.

The new version is also substantially faster, which should make a difference on large files: previous versions handled large files very inefficiently.

Version 2.3.beta build 84 out for testing

Wed, 27 Apr 2011 16:38:46 +0200

A few minor fixes have been done: e.g. verbatim should work better now, HTML output is a little more robust, and I have added support for Korean (although I don't know yet if that works as it should). I have also cleaned up the code a bit, which may not only make code maintenance easier but also make the code more readable.

Release of version 2.3.beta for testing

Tue, 08 Mar 2011 09:04:56 +0100

The numerous 2.3.alpha builds have added several new features. Here's a quick summary:

From version 2.3, TeXcount will use UTF-8 Unicode internally, but can read other file encodings if specified at the command line. With this comes Unicode support for, and annotation of, different scripts, which the user can specify in great detail if so desired. The default now is to interpret all text, including e.g. Chinese and Japanese, as countable text.

Word frequency counts may be produced, as well as frequency counts for the different script types which can be quite powerful if the user specifies the scripts used (alphabets and "logogram" sets).

TeXcount can now be used in a pipe, i.e. read from STDIN.

The file inclusion scheme has been improved and TeXcount can now merge included files into the main document. It can also handle the bibliography, although the accuracy of bibliography word counts may be in question.

The summary output can now be specified using a template. And TeXcount can now take options from an option file, so it is possible to maintain one or several option files which may simply be specified on the command line instead of including a long list of options.

There are also a number of bugs and problems that have been fixed, including some macros for which support has been added (or corrected).

Finally, TeXcount now has the ability to handle package specific macro handling rules, although still most rules are defined at the top level which is included by default. If you know of macros or packages for which rules should be added, please notify me.

Updated version 2.3.alpha (build 956)

Wed, 23 Feb 2011 07:26:45 +0100

The final alpha update is now here with several changes and new features.

To take the small things first, I have added support for the \verb macro since this often contains code that may confuse TeXcount. The new option -0 is the same as -1, i.e. one line summary output, but ensures that there is no trailing line break. Some of the word rules from -relaxed have been moved to the default parsing mode, and a -restricted parsing mode added which is more strict.

The main change is that TeXcount now accepts more different file encodings, but converts all text to Unicode internally, which has some impact on how words and letters are identified. Previously, the default was to process text as Latin-1 encoded text, with the option -utf8 to switch to UTF-8 encoded Unicode. Now, TeXcount will guess the encoding, or it may be specified using the -encoding option which should permit all encodings recognised by Perl: e.g. Big5, GB2312, etc. Those who only use Latin letters may not notice much difference, but writers of e.g. Chinese should find that the support is substantially improved (and some bugs have been fixed).

TeXcount now identifies Chinese, Japanese, Thai and Lao characters as single character words by default, so options -chinese and -japanese are no longer required. There are -chinese-only and -japanese-only options for those who do not want to include e.g. Latin letter words.
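The idea of counting each logographic character as a word of its own, while still counting Latin-letter words as whole words, can be sketched in Python as below. This is a rough illustration under stated assumptions, not TeXcount's implementation: the character ranges cover only CJK Unified Ideographs, Hiragana and Katakana (Thai and Lao are left out for brevity), and the names LOGOGRAM, TOKEN and count_words are made up for this example.

```python
import re

# Assumed, deliberately small set of ranges:
# U+4E00-U+9FFF (CJK Unified Ideographs) and U+3040-U+30FF (Hiragana and Katakana).
LOGOGRAM = r"\u4e00-\u9fff\u3040-\u30ff"

# A token is either one logographic character or a run of Latin letters
# (with optional inner apostrophes or hyphens).
TOKEN = re.compile(r"[" + LOGOGRAM + r"]|[A-Za-z]+(?:['-][A-Za-z]+)*")


def count_words(text):
    """Count Latin-letter words plus one word per logographic character."""
    return len(TOKEN.findall(text))


if __name__ == "__main__":
    print(count_words("TeXcount 支持中文"))  # one Latin word plus four characters
```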

There are also now options to specify in detail the scripts (alphabets and character sets) included in word counts using the -alphabet= and -logogram= options. These rely on the Unicode character properties.

The option -stat has been added which produces a word count statistic: the number of words from each letter/character category. This may be useful for texts that contain e.g. Chinese and Latin. By using the -alphabet= and -logogram= options you can specify which script sets it will identify, e.g. to count words using Latin, Greek, Hebrew and Cyrillic letters. This summary statistic is, like the word frequency count, only done globally: i.e. on the sum of all parsed documents.

Updated version 2.3.alpha (build 547)

Mon, 17 Jan 2011 09:54:40 +0100

Another alpha update with some changes and improvements. The main new feature is that an option has been added to include the bibliography in the word count; automatic file inclusion should also work with this. The web service interface has been improved to include new features as well.

Another thing is that the old web pages have finally been closed down: this is now the new official TeXcount web site.

Updated version 2.3.alpha (build 480)

Mon, 22 Nov 2010 15:12:34 +0100

Alas, in the release of build 434, the latest files had not made it into the zipped package. Here's a new build. I have also tried to improve on the handling of ignored lines when full parsing details are not used.

Updated version 2.3.alpha (build 434)

Mon, 15 Nov 2010 17:12:10 +0100

Build 434 contains a number of improvements and fixes, although most of them are minor or not immediately apparent.

As a new feature, the option -freq has been added to count word frequencies in the parsed LaTeX documents. This only provides an overall total, not counts per file. In order to limit the displayed counts, you may specify -freq=# where # is the minimum frequency required to be listed in the word count output.
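As a rough picture of what a frequency count with a minimum threshold does, here is a small Python sketch. It only mimics the idea behind -freq=#; the function word_frequencies and its simple Latin-letter tokenisation are assumptions made for this example and say nothing about how TeXcount tokenises or formats its output.

```python
import re
from collections import Counter


def word_frequencies(text, min_freq=1):
    """Count words (case-insensitively) and keep those seen at least min_freq times."""
    # Simplifying assumption: a word is a run of Latin letters.
    words = re.findall(r"[A-Za-z]+", text.lower())
    counts = Counter(words)
    # most_common() orders the words from most to least frequent.
    return {word: n for word, n in counts.most_common() if n >= min_freq}


if __name__ == "__main__":
    print(word_frequencies("the cat and the hat and the bat", min_freq=2))
```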

Previously, LaTeX documents included using e.g. \include could be automatically parsed by TeXcount by using the option -inc. The new option -merge allows included files to be merged into the parent document.
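The difference between counting included files separately and merging them into the parent can be pictured with a short Python sketch. It is a simplified illustration, not TeXcount's code: files are represented by a plain dictionary instead of being read from disk, only \input and \include with a single braced argument are recognised, and the function name merge_includes is hypothetical.

```python
import re

# Matches \input{name} or \include{name}; the name is captured in group 1.
INCLUDE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def merge_includes(name, files, _seen=()):
    """Return the text of files[name] with every included file spliced in place."""
    if name in _seen:
        raise ValueError("circular inclusion of " + name)

    def expand(match):
        # Recursively merge the included file, remembering the chain of parents.
        return merge_includes(match.group(1), files, _seen + (name,))

    return INCLUDE.sub(expand, files[name])


if __name__ == "__main__":
    documents = {
        "main": r"Intro \input{ch1} end",
        "ch1": r"one \input{sec} two",
        "sec": "nested",
    }
    print(merge_includes("main", documents))
```

Once the text has been merged like this, a single count over the result covers the whole document, rather than one count per file.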

In previous versions, spaces were ignored apart from being separators between words and macros. Spaces were then added afterwards, leading to extra spaces being added where none were in the original document. This is now changed so the verbose output should be more similar to the LaTeX code.

The way TeXcount handles documents and parses the code has been improved greatly in order to increase the speed. One part of this is to parse paragraph by paragraph so as to handle only smaller parts of the document at a time.

The set of macro and group names accepted in %TC instructions to TeXcount has been extended: it was previously too restrictive, making it hard to add rules for macros and groups with names containing characters other than letters.

A few minor bugs were fixed.

Use of locale has been discontinued: commented out in the Perl code of TeXcount, but may be reactivated in case it is needed by removing the # in front of use locale.

Updated version 2.3.alpha (build 29)

Fri, 23 Jul 2010 19:39:01 +0800

There has been a problem with file names and paths that contain spaces. This should now be fixed and work under both Windows and Linux.

TeXcount can now read a LaTeX document from STDIN (standard input), so it is now possible to use TeXcount in a pipe. Just add the option - to make TeXcount read STDIN. TeXcount reads STDIN just as if it was a file, but after all other files have been read.

Version 2.3.alpha being released for testing

Sun, 20 Jun 2010 15:19:25 +0800

Version 2.3 is due soon with some new features:

ability to count characters instead of words;
print results using your own output templates;
get help on parsing rules for specific macros and groups;
preparsing substitution (e.g. for inserting your own path definitions);
specify an option file with your own options and settings.

A few minor changes and fixes are also in place:

handling of some groups (equation*, eqnarray*, align*) did not work and explicit rules for these are now in place (any missing?);
I have added -strict to turn on warnings for groups without rules defined;
subcounts are now turned on by default, but can be turned off by -nosub;
ANSI colours are now off by default under Windows (no more need for -nocol).

One major change is the ability to define package specific rules which will be included when the package is included. So far, the code to handle this is in place, but rules have not been implemented. If you have packages you use which are not covered by the ordinary macro rules, please drop me a line. I could use some help both identifying the packages, macros and groups, and determining what the rules should be, as the macros I tend to use are already covered in the default set of rules.

Version 2.2 released

Thu, 30 Apr 2009 06:41:02 +0200

Version 2.2 of TeXcount is now released. The main change is that it now supports UTF-8 (Unicode), allowing TeXcount to be used on characters other than the Latin alphabet. There is also special support for Chinese and Japanese added. A few minor problems have also been fixed.

Version 2.1: Quick fix

Sun, 9 Nov 2008 07:38:58 +0200

In the download package, the script had been saved in Windows file format, and needed to be converted to Linux format using dos2unix to make it run on the Linux command line. I have fixed this. The default -sub option has been changed to subsection as specified in the documentation: it was set to section.

Version 2.1 released and Web-interface upgraded

Sun, 2 Nov 2008 20:13:28 +0100

Version 2.1 is finally released. It has several improvements over the older version 2.0: subcounts per section, cumulative counts per line, ability to exclude segments of the document from counting, more flexibility in specifying the output (from detailed summary to just a single number), and help on the style and colour used in the verbose output. In addition, there are a few minor fixes. The web-interface has also been upgraded to make use of these features.

Version 2.1.beta ready for testing

Thu, 30 Oct 2008 04:24:18 +0100

The main improvement over 2.1.alpha is that help on the colour codes of the verbose output is now provided. Also a few minor fixes and improvements. A final 2.1 release should be ready shortly, which is a clear improvement over 2.0.

Version 2.1.alpha ready for testing

Thu, 10 Jul 2008 03:15:11 +0100

This is a preliminary release of version 2.1 for testing purposes, so any feedback would be welcomed. A few problems have been resolved, and some more features added. In particular, subcounts (e.g. by chapter) are now available, a few alternatives for summary output exist (e.g. total sum only), and a relaxed parsing mode has been added that allows more general characters and some macros to be parts of words and macro options.

Version 2.0 again

Thu, 12 Jun 2008 23:55:57 +0200

It seems the download version was still the 2.0.beta, not the final 2.0. Differences were minor, but it has been fixed now. I've also improved the download script a little. I plan to do some improvements to the script over the next few weeks, so if there are any suggestions, this is the time. I've added a TODO list on the web page of things I plan to do.

Version 2.0 released

Sun, 10 Feb 2008 17:28:40 +0200

After a few minor fixes, I'm releasing version 2.0. Linux users may note that I have changed the first line which gives the Perl command. I've also changed the name of the rule 'exclude' to 'macro' which is more appropriate.

Version 2.0.beta released

Thu, 31 Jan 2008 17:15:31 +0100

TeXcount version 2.0 is now ready: I have labeled it a beta version for the time being in case there are any problems I have not discovered. Version 2.0 contains some major improvements over the older versions: in particular the preamble is now handled more properly. It is also possible to define macro handling rules as comments in the tex documents. More, as well as more general, macro handling rules are also possible with this version.

News feed for updates!

Thu, 31 Jan 2008 17:08:53 +0100

The TeXcount web site has been upgraded with this RSS news feed. I will use this to announce updates or other news relevant to users of TeXcount.