Sentence Analyzer - enabling business and enterprise applications to handle sentences and text

The genesis and design principles story.

While working on the document-extraction demo at text2data.net, the repetitive nature of sentences in most officialese documents got me thinking about a simple way to handle sentences.

Sentences seem to form structural as well as content boundaries in a document. Any information extraction process becomes doubly powerful once sentence structure is included among the other things it looks at.

My approach to sentences is essentially document-centric: identify the document, then parse it along its structural or linguistic boundaries, and so on.

Existing NLP practices seemed to me too inflexible: incapable of consuming a wide range of sentences, and of presenting the results in a form that business users would find friendly enough.

I also felt that any software using sentences ought not to inflict on the user any learning curve beyond sentences themselves.

The basic rule I have followed throughout is to go wherever grammar (i.e. whatever little I learned at high school) leads, and not to depend on maths and statistics, so that, if needed, the real grammar experts could improve the whole thing simply by supplying better grammar rules.

That way, I hope to include stronger and better grammar rules in the next version, as soon as I can get hold of an expert.
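
To make the rules-over-statistics idea concrete, here is a minimal sketch in Python. It is not the analyzer's actual code: the function name split_sentences and the ABBREVIATIONS table are hypothetical, chosen only to show how behaviour could live in a small rule table that a grammar expert might extend without touching the logic.

```python
import re

# Hypothetical, swappable rule table (an assumption for illustration):
# tokens after which a period does not end a sentence.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "inc.", "ltd.", "no.", "e.g.", "i.e."}


def split_sentences(text, abbreviations=ABBREVIATIONS):
    """Split text at '.', '!' or '?' followed by whitespace,
    unless the token ending there is a known abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]+\s+", text):
        candidate = text[start:match.end()].strip()
        last_token = candidate.split()[-1].lower()
        if last_token in abbreviations:
            # Rule hit: this period belongs to an abbreviation, keep scanning.
            continue
        sentences.append(candidate)
        start = match.end()
    # Whatever remains after the last boundary is the final sentence.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    sample = "Dr. Smith signed the report. The board approved it on Monday!"
    for sentence in split_sentences(sample):
        print(sentence)
```

Improving such a splitter means editing the rule table, for instance passing a larger abbreviations set, rather than retraining anything.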

The key design principle is KISS: do the bare minimum, keep as many things configurable as possible, and so on.

More view/try pages here.
Generic use cases
Back to home/basic analyzer
Comparing sentences, several modes
Find/search/sort/filter
Crunching a big text
Wildcard usages in pure structure mode
Business use cases
Handling Notes section in annual reports
Crunching of a Presidential speech
Executive profiles
Project statuses
USPTO events alerter
Customer reviews

More reading for those interested...

1. Things that are not obvious from the demo
2. Business products and possibilities
3. So what !! Universal grammar has been in use for decades now ...
4. The inevitable comparisons, to what already exists out there.
5. What is a sentence, to future application builders ?
6. The genesis and design principles story
7. Arbitrary listing of business usages
8. Extensions, additions, customizations possible in the toolkit
9. Important : Combining with the document extraction tool at text2data.net, benefits
Contact at : kinshuk_in @ yahoo dot. com