Print Math to Braille Project
Science and Engineering Division
of the National Federation of the Blind
Greetings:
The Print Math to Braille Project is an effort to encourage
solutions for recognizing print math notation, as well as a two-way
translator between the semantic math language LaTeX and the
Nemeth Braille Code. By far the more challenging of the two engineering
tasks is recognition of print math notation. Getting resources
for this problem is the primary focus of the project.
I have conducted a literature survey and been in contact
with Jim Fruchterman.
I have written a proposal to a commercial vendor and approached a
professor doing research in the area. Of course, the project is
under the invaluable guidance of Brian Buhrow, Chairman, Research
and Development Committee of the National Federation of the Blind.
I will keep you posted with the project's progress on the recognition
front.
The translator between LaTeX and Nemeth Braille Code is a fairly
straightforward software problem. It should take one programmer
less than a year for a solid solution. I wrote a passable solution in
Pascal in 1991.
Because of the recent interest in translation software
between LaTeX and Nemeth Braille Code on the nfb-se@nfbcal.org list,
I believe we should start a discussion on what we want the translator
to do. I strongly believe there should be a commercial solution with
good technical support. Let's work together to define the problem
and lay the groundwork to get a commercial product out there
in the near future.
What You Can Do
1) print math recognition
Collect sample typeset math notation that you come across in your
schoolwork or job. Send me a photocopy of the title page and the
sample pages of the text. Make the copy quality as good as possible
since recognition is very sensitive to photocopy noise. The experiment
may require obtaining the original text to achieve best results.
The Print Math to Braille Project is collecting sample math notation
to assist in training and assessing state-of-the-art recognition
technology.
2) translator between LaTeX and Nemeth braille code
Collect electronic Nemeth braille code and LaTeX equations.
Separate them into typical examples you use and standard equations
in the literature. The Print Math to Braille project will collect
databases of these equations for software design and test.
You can reach me at jamiller@qualcomm.com or write me at:
John Miller
8720 Villa La Jolla Drive #118
La Jolla, CA 92037
wk fax : (619) 658-1585
Here is the problem statement. Happy reading!
Why Scan Math
Print is the public medium of choice. It is compact and instantly accessible.
This is particularly true for desktop-published mathematical notation. It is
straightforward to generate print hard copy and to disseminate the hard copy to a
diverse community. The increasing reliance upon computers for exchanging
documents electronically will not stop the distribution of documents in print.
If you ask for a paper on a particular subject, you can expect to get it in
print. Even if an electronic copy of the paper exists, finding its location
and acquiring the know-how to get it may take weeks. Whether an English student given a
handout in Calculus, or an engineer given an industry standard, the information
seeker will find the information in print.
For the sighted, hard copy print presents math notation so that it is easy to read
and quick to reference. For the blind, the parallel medium is the Nemeth braille
code. To really understand an article with dense math notation you need to know
it like the back of your hand. The only way to do that is to read it from start
to finish and to revisit referenced equations as you go. The only way to do
this as a blind person is to read the article in braille. The blind person can have a
human reader read the document and transcribe it into braille himself, or he can ask
a Nemeth braille transcriber to prepare the document in braille.
Math braille transcription is a time-consuming and costly process. Most
transcribers request 6 months to prepare a document. A turnaround as short as a week
is possible, but demand exceeds supply. Math braille typically
runs 2 to 3 braille pages per print page. A standard rate for math braille
transcription is $2.50 per braille page. For example, a 20-page math article
runs 40 to 60 braille pages, so transcribing it into Nemeth braille code
costs $100 to $150, typically about $125.
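The cost estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name and parameter names are my own, and the figures are the ones quoted in the paragraph (2 to 3 braille pages per print page, $2.50 per braille page).

```python
def transcription_cost(print_pages, braille_per_print, rate_per_braille_page=2.50):
    """Estimated cost in dollars to transcribe a print math document.

    braille_per_print is the number of braille pages produced per print
    page (the text above quotes 2 to 3 for math braille).
    """
    return print_pages * braille_per_print * rate_per_braille_page


# A 20-page article at the low, middle, and high expansion ratios.
low = transcription_cost(20, 2)
typical = transcription_cost(20, 2.5)
high = transcription_cost(20, 3)
print(low, typical, high)
```

The middle ratio of 2.5 braille pages per print page gives the $125 figure quoted above.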
Most Nemeth braille transcribers started transcribing during an era of
volunteerism back in the 1950s and early 1960s. The spirit of volunteerism was
high and the economic conditions were such that it was common for a person
to work full time as a volunteer. The economic times have changed, and few
new braille transcribers are joining the ranks of those who have been at work
for more than 30 years.
The blind need an automated process to convert print math into Nemeth braille
code. The lack of instant access to Nemeth braille code documents in the
workplace and at school is a huge barrier today to the blind in the field of
science. A number of volunteers stand ready to proofread and edit a Nemeth
braille code draft of the quality computer software could generate. In addition,
the instant access to print math notation with an acceptable error rate
would help put blind scientists on an even footing with sighted peers.
The print math notation to Nemeth braille code process consists of two parts.
The first part is a print math notation recognition engine. The recognition
engine should support only typeset print. While handwritten math notation
recognition would be useful, today's technology cannot reliably recognize
handwriting but could reliably recognize typeset mathematics. The recognition
engine should convert math notation into a standard electronic text
representation. This is especially important so that software vendors can
develop programs to convert text stored in this standard into Nemeth braille
code. The electronic representation of mathematics should be in TeX, an ASCII
text mathematical notation language that scientists have used to exchange
documents electronically for the last 20 years. The best representation is
LaTeX, a macro package built on the TeX typesetting language. The scientific community will
continue to support TeX and develop translation software from TeX into other
publisher formats such as HTML 3. In this way it should be straightforward
to convert texts into other electronic standard formats as well.
The print math notation recognition engine should sell for 150 to 300 dollars
but initially might sell for a much higher price. Its potential number of customers is vast.
Anyone in the scientific community would benefit by using this program to include an
excerpt of math notation from a print article directly into a paper or handout.
The second part of the print math notation to Nemeth braille code process is
translating from the electronic notation standard TeX into Nemeth braille
code. This program has a very small market. I estimate that 200 customers would
buy the program initially, and perhaps far fewer. Blind students would ask
universities to make it available at college libraries and would purchase
personal copies to help with studies; perhaps 50 blind professionals would buy
it to use in their work, and 50 Nemeth braille transcription groups would
purchase it as well. Because of the small market, an initial price of $500 to
$1000 would be reasonable. A translation program from LaTeX into Nemeth braille
code formatted for hard copy braille offers the blind the best instant access
to math notation. The program should be robust enough
to support all LaTeX math notation. The software program would
make all electronic math texts available in braille. It would provide access
to print through the math notation recognition engine and, as publishers
make available the electronic versions of texts, it would also provide access to
published texts free from recognition errors.
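To make the second part concrete, here is a toy sketch in Python of what translating LaTeX into Nemeth braille code involves. It is not the 1991 Pascal program and not a proposed design. The function name is hypothetical, the output is written in Braille ASCII, and the symbol values and spacing rules in it (numeric indicator, superscript and baseline indicators, spaced equals sign) are my recollection of the Nemeth Code and should be checked against the official code book. It handles only digits, lowercase letters, plus, minus, equals, and one level of superscript.

```python
DIGITS = "0123456789"

# Braille ASCII values below are assumptions to verify against the code book.
NUMERIC = "#"      # numeric indicator (dots 3456)
SUPERSCRIPT = "^"  # superscript level indicator (dots 45)
BASELINE = '"'     # baseline indicator (dot 5)
EQUALS = " .k "    # equals sign (dots 46, 13), spaced as a comparison sign


def latex_to_nemeth(src, after_space=True, top_level=True):
    """Translate a tiny LaTeX subset into Nemeth code in Braille ASCII."""
    src = src.replace(" ", "")  # LaTeX ignores spaces in math mode
    out = []
    i = 0
    while i < len(src):
        ch = src[i]
        if ch in DIGITS:
            j = i
            while j < len(src) and src[j] in DIGITS:
                j += 1
            # Numeric indicator only at the start or after a space.
            out.append((NUMERIC if after_space else "") + src[i:j])
            after_space = False
            i = j
        elif ch in "+-":
            # Operation signs are unspaced; a leading minus keeps the
            # numeric indicator on the number that follows it.
            out.append(ch)
            i += 1
        elif "a" <= ch <= "z":
            out.append(ch)
            after_space = False
            i += 1
        elif ch == "=" and top_level:
            out.append(EQUALS)
            after_space = True
            i += 1
        elif ch == "^" and top_level and i + 1 < len(src):
            if src[i + 1] == "{":
                end = src.index("}", i)  # no nested braces in this sketch
                body = src[i + 2:end]
                i = end + 1
            else:
                body = src[i + 1]
                i += 2
            out.append(SUPERSCRIPT + latex_to_nemeth(body, False, False))
            # The space around a comparison sign returns to the baseline
            # by itself; otherwise the baseline indicator is written.
            if i < len(src) and src[i] != "=":
                out.append(BASELINE)
            after_space = False
        else:
            raise ValueError(f"unsupported input at position {i}: {ch!r}")
    return "".join(out)


print(latex_to_nemeth("2+3=5"))
print(latex_to_nemeth("x^2+1=10"))
print(latex_to_nemeth("x^{n+1}=y"))
```

Even this small subset needs context rules, such as when to write the numeric indicator and when to return to the baseline. A robust translator has to carry rules like these across all of LaTeX math notation, which is why a database of paired LaTeX and Nemeth examples is valuable for testing.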
SAMPLE MATH NOTATION
PRINT MATH TO BRAILLE PROJECT
SCIENCE & ENGINEERING DIVISION
OF
THE NATIONAL FEDERATION OF THE BLIND
July 29, 1996
A prototype solution for the Print Math to Braille Project should recognize the enclosed
sample math notation. The Nemeth Braille Code published by American Printing House
contains math notation and its translation to Braille. This text is an excellent starting point
for locating different kinds of print math notation. The average user will want to recognize
notation found in a calculus book as well as math that occurs in the
professional field.
Sample notation includes:
1. Calculus by James Stewart, pages 108 and 109 containing the definition of
the derivative.
2. Field and Wave Electromagnetics by David K. Cheng, pages 323 and 324
demonstrating Maxwell's equations in differential and integral form.
3. Discrete-Time Signal Processing by Oppenheim and Schafer, pages 45, 526,
611, 730 and 777 demonstrating the Fourier transform and symbols with tildes and hats
and superscripts and subscripts.
4. EVRC, a standard document from the telephone industry that contains dense
math notation.
------------------------------
John Miller, President
Science and Engineering Division
of the National Federation of the Blind
E-mail: jamiller@qualcomm.com
Phone: (619) 658-2689
------------------------------