Sphinx-4 Speech Recognition System

-------------------------------------------------------------------

Version: 1.0Beta6
Release Date: March 2011

-------------------------------------------------------------------

New Features and Improvements:

* SRGS/GrXML support, more to come soon with support for JSAPI2
* Model layout is unified with Pocketsphinx/Sphinxtrain
* NetBeans project files are included
* Language models can be loaded from a URI
* Batch testing application allows testing inside Sphinxtrain

Bug Fixes:

* Flat linguist accuracy issue fixed
* Intelligent sorting in partitioner fixes stack overflow when tokens
  have identical scores
* Various bug fixes

Thanks:

Timo Baumann, Nasir Hussain, Michele Alessandrini, Evandro Gouvea,
Stephen Marquard, Larry A. Taylor, Yuri Orlov, Dirk Schnelle-Walka,
James Chivers, Firas Al Khalil

-------------------------------------------------------------------

Version: 1.0Beta5
Release Date: August 2010

-------------------------------------------------------------------

New Features and Improvements:

* Alignment demo and grammar to align long speech recordings to
  transcription and get word times
* Lattice grammar for multipass decoding
* Explicit backoff in LexTree linguist
* Significant LVCSR speedup with proper LexTree compression
* Simple filter to drop zero-energy frames
* Graphviz for grammar dump visualization instead of AISee
* Voxforge decoding accuracy test
* Lattice scoring speedup
* JSAPI-free JSGF parser

Bug Fixes:

* Insertion probabilities are counted in lattice scores
* Don't waste resources and memory on dummy acoustic model
  transformations
* Small DMP files are loaded properly
* JSGF parser fixes
* Documentation improvements
* Debian package stuff

Thanks:

Antoine Raux, Marek Lesiak, Yaniv Kunda, Brian Romanowski, Tony
Robinson, Bhiksha Raj, Timo Baumann, Michele Alessandrini, Francisco
Aguilera, Peter Wolf, David Huggins-Daines, Dirk Schnelle-Walka.

-------------------------------------------------------------------

Version: 1.0Beta4
Release Date: February 2010

-------------------------------------------------------------------

New Features and Improvements:

* Large arbitrary-order language models
* Simplified and reworked model loading code
* Raw configuration and demos
* HTK model loader
* A lot of code optimizations
* JSAPI-independent JSGF parser
* Noise filtering components
* Lattice rescoring
* Server-based language model

Bug fixes:

* Lots of bug fixes: PLP extraction, race conditions
  in scoring, etc.

Thanks:

Peter Wolf, Yaniv Kunda, Antoine Raux, Dirk Schnelle-Walka,
Yannick Estève, Anthony Rousseau and LIUM team, Christophe Cerisara.

-------------------------------------------------------------------

Version: 1.0Beta3
Release Date: August 2009

-------------------------------------------------------------------

New Features and Improvements:

* BatchAGC frontend component
* Completed transition to defaults in annotations
* ConcatFeatureExtractor to cooperate with cepwin models
* End of stream signals are passed to the decoder to fix cancellation
* Timer API improvement
* Threading policy is changed to TAS

Bug fixes:

* Fixes reading UTF-8 from language model dump
* Huge memory optimization of the lattice compression
* More stable frontend work with DataStart and DataEnd and optional
  SpeechStart/SpeechEnd

Thanks:

Yaniv Kunda, Michele Alessandrini, Holger Brandl, Timo Baumann,
Evandro Gouvea

-------------------------------------------------------------------

Version: 1.0Beta2
Release Date: February 2009

-------------------------------------------------------------------

New Features and Improvements:

* New, much cleaner and more robust configuration system
* Migrated to Java 5
* XML-free instantiation of new systems
* Improved feature extraction (better voice activity detection, many bug fixes)
* Cleaned up some of the core APIs
* include-tag for configuration files
* Better JavaSound support
* Fully qualified grammar names in JSGF (Roger Toenz)
* Support for dictionary addenda in the FastDictionary (Gregg Liming)
* Added batch tools for measuring performance on NIST corpus with CTL files
* Many performance and stability improvements

-------------------------------------------------------------------

Version: 1.0Beta
Release Date: September 2004

-------------------------------------------------------------------

New Features:

* Confidence scoring
* Posterior probability computation
* Sausage creation from a lattice
* Dynamic grammars
* Narrow bandwidth acoustic model
* Out-of-grammar utterance rejection
* More demonstration programs
* WSJ5K Language model

Improvements:

* Better control over microphone selection
* JSGF limitations removed
* Improved performance for large, high-perplexity JSGF grammars
* Added Filler support for JSGF Grammars
* Ability to configure microphone input
* Added ECMAScript Action Tags support and demos

Bug fixes:

* Lots of bug fixes

Documentation:

* Added the Sphinx-4 FAQ
* Added scripts and instructions for building a WSJ5K language model
  from LDC data

Thanks:

* Peter Gorniak, Willie Walker, Philip Kwok, Paul Lamere

-------------------------------------------------------------------

Version: 0.1alpha
Release Date: June 2004

-------------------------------------------------------------------

Initial release