SciPy: an open source tool for scientific computing. I wrote the maximum entropy module and parts of the sparse matrix module. An earlier version of the maximum entropy module was available as the ftw Text Modeller on SourceForge.

Question corpus

Big question corpus (307k unique questions in English; filtered, scripts included; 2.8 MB; tar.bz2 format)


BibTeX database of publications in probability and statistics, machine learning, and speech and language technology

Advice to non-native authors of English

Prefer a single word to two: not "carried out" but "conducted" or "performed".

Write in the active voice: not "A comparison of single-carrier versus multicarrier has been found to be obsolete" but "We have compared single-carrier and multicarrier transmission and found" or "Our comparison … has found".

Do not use commas before subordinate clauses: not "it was clear, that" but "it was clear that". One exception to this rule is that parenthetical remarks should be enclosed by two commas: "We conjecture that, with a wide-band signal and mild background noise, a speech recognizer may …"; "Branch-and-bound methods cannot, except in special cases, reduce the worst-case time complexity …" Using only one comma around a parenthetical remark is wrong: "Dr Holmes, a detective is …" should be "Dr Holmes, a detective, is …".

Apply Occam's razor. Prefer "and" to "as well as"; "with" to "together with"; "arising from" to "that particularly arise due to the".

Delete "very". "Is crucial", "is important", "is vital", "is interesting" require no "very". Use "is nearly complete" rather than "is very close to completion".

German speakers tend to use too many hyphens. Do not hyphenate "xDSL performance", "multimedia technologies".

Avoid "e.g." and "i.e.". Leave them out or integrate them into the main text with, for example, "such as".

