Electronic component: 65-0087

Voice Extreme™ IC
Speech Recognition Controller
Data sheet
2001 Sensory Inc.
P/N 80-0204-C
Description
The Voice Extreme™ IC (VE-IC), from the
Interactive Speech™ family of products, is an 8-bit
microcontroller designed specifically for speech
applications in consumer electronic products.

Combined with a 2 Mbit external flash memory, the
VE-IC offers the flexibility of a microcontroller with
advanced speech technology, including high-quality
speech recognition, speech and music synthesis,
speaker verification, and voice record and playback.
Products can use one or all of the VE-IC features in
a single application.

The VE-IC supports Sensory Speech™ 6, the latest
speech recognition technology from Sensory, which
includes a number of new techniques that
significantly improve recognition performance over
previous versions. Using sophisticated neural
network technology, on-chip speech recognition
algorithms reach an accuracy of greater than 97%
for speaker-independent recognition and greater
than 99% for speaker-dependent recognition.

The VE-IC can be purchased in DIE or TQFP
packages, or fully assembled as part of the VE
Module. A low-cost development system, the VE
Toolkit, contains all software and hardware needed
to create voice-activated products.
Features
Full Range of Sensory Speech™ 6 Capabilities
Speaker-independent speech recognition
Speaker-dependent speech recognition
High quality speech synthesis and sound effects
Speaker verification
Three-voice music synthesis
Voice record & playback
Integrated Single-Chip Solution
4 MIPS 8-bit microcontroller
On-chip A/D and D/A converters, pre-amplifier, and AGC
32kHz clock for time keeping (DIE package only)
Secondary Timer 2
14 I/O lines
RS-232 serial interface (uses three I/O)
16-bit external memory bus
24x24 Multiplier for rapid recognition processing
Low Power Requirements
2.85V to 5.25V operation
~10mA operating current at 3V
Power down mode; <5 µA standby current
The Voice Extreme™ System
Built-in VE-C Interpreter (a subset of ANSI-C)
Built-in Dynamic Memory Handler
Built-in Speech Technology Firmware
Voice Extreme™ IC Block Diagram
[Figure: block diagram of the VE-IC. Labeled blocks: microphone, mic preamp, AGC, ADC, speech processing unit, CPU, VE-C interpreter, speech technology firmware, dynamic memory handler, 2Mb Flash, DAC, PWM, amplifier, speaker, I/O, RS232.]
VE-IC architecture
The VE-IC is a highly integrated device that
combines:
8-bit microcontroller
On-chip VE-C Interpreter
2 Mb Flash memory manager
RAM (2.5 Kbytes)
A/D converter and D/A converter
Input amplifier and pulse width modulator

The VE-IC has an external memory interface, with a
16-bit address bus and an 8-bit data bus, for
accessing the external 2 Mb Flash memory.
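
As a consistency check on the memory interface, the sketch below counts the address lines a byte-wide 2 Mbit Flash needs. It assumes the 8-bit data bus stated above; the helper name is illustrative and is not part of any Sensory tool. The result is 18 lines, which agrees with A[15:0] plus the two extra lines A16 and A17 that the I/O notes in this data sheet place on P0.5 and P0.6.

```c
#include <stdint.h>

/* Hypothetical helper: number of address lines needed to reach
 * every byte of a memory holding `bits` bits on an 8-bit data bus. */
static unsigned address_lines_for(uint32_t bits)
{
    uint32_t bytes = bits / 8u;   /* byte-wide device */
    unsigned lines = 0;

    while ((1ul << lines) < bytes) {
        lines++;
    }
    return lines;
}
```

Calling `address_lines_for(2u * 1024u * 1024u)` gives 18, two more than the 16-bit bus provides on its own.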

Two bi-directional ports provide 14 general-purpose
I/O pins to communicate with external devices
(RS232 uses three I/O). The VE-IC has a
high frequency (14.32 MHz) oscillator as well as a
low frequency (32,768 Hz) oscillator suitable for
timekeeping applications (available only in the die version).
The processor clock can be selected from either
source, with a selectable divider value. The device
performs speech recognition when running at
14.32 MHz. There are two programmable 8-bit
counters / timers, one derived from each oscillator.
Speech recognition
The VE-IC uses a neural network to perform speaker-independent or speaker-dependent speech recognition
and uses the external flash memory to store speech recognition information. The VE-IC has several additional
speech recognition features as described below.
Continuous Listening
Continuous Listening allows the chip to continuously listen for a specific word. With this feature, a product can
operate in a normal environment and "activate" only when a specific word, preceded by quiet, is spoken.
Speech and music synthesis
The VE-IC provides high-quality speech synthesis by using a hybrid of a time-domain compression scheme that
improves on conventional ADPCM and a customized reuse of sounds. Speech synthesis uses the external flash
to store audio sounds for synthesis.
The VE-IC provides high-quality, low-cost three-voice music synthesis which allows multiple, simultaneous
instruments for harmonizing; it uses a MIDI-like system to generate music.
Record and playback
The VE-IC can perform audio record and playback at various compression levels depending on the quantity and
quality of playback desired. Data rates of under 14,000 bits per second are achievable while maintaining very
high quality reproduction. VE-IC also performs silence removal to improve sound quality and reduce memory
requirements.
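
To give a feel for what the quoted data rate means for memory, the sketch below estimates how many seconds of audio fit in a given amount of Flash. It is a rough bound only: it treats 14,000 bits per second as the data rate, ignores any gain from silence removal, and the function name is illustrative. In a real product only part of the 2 Mbit Flash is free for recordings, because the same device also holds application code and speech data.

```c
#include <stdint.h>

#define FLASH_BITS (2ul * 1024ul * 1024ul)  /* 2 Mbit external Flash */

/* Seconds of audio that fit in `free_bits` of Flash at `bits_per_second`.
 * Integer division rounds down; bits_per_second must be nonzero. */
static uint32_t record_seconds(uint32_t free_bits, uint32_t bits_per_second)
{
    return free_bits / bits_per_second;
}
```

With the whole Flash available, `record_seconds(FLASH_BITS, 14000)` returns 149; with half of it, 74.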
Speaker verification
The VE-IC can also perform text-dependent speaker verification. After a speaker trains the chip on a specific
word, the chip is able to identify whether that word is spoken by the original speaker, thus providing biometric
security.
Word spot
Word Spot provides the ability to recognize trigger words embedded in continuous speech; thus the password
sequence "Robert Henson" could be recognized if spoken as "My name is Robert John Henson". Word Spot can only
be used with the speaker-dependent technology; thus it always requires a training phase.
[Figure: VE-IC architecture block diagram. Labeled blocks: CPU, Voice Extreme interpreter (internal ROM), speech processing unit, 2K TECHNOLOGY SRAM, register space (448 bytes), stack space (8 levels), break point register, interrupt logic, timing and control, TIMER1, TIMER2, OSC1, OSC2, 2Mb Flash memory interface, analog control, pre-amp, ADC, DAC, pulse width modulator, PORT 0, PORT 1. Labeled signals: A[15:0], A[16:17], D[7:0], -RDC, -WRD, -RDD, -WRC, -RESET, PDN, XI1, XO1, XI2, XO2, P0.0-P0.7, P1.0-P1.7, AiN, AiN1, AiFE1, AiFE2, AOFE1, AOFE2, AOFE3, DACOUT, PWM, BUFOUT/.]
I/O Notes:
P0.0 = RS232 data IN (RCV)
P0.1 = RS232 data OUT (XMT)
P1.7 = Serial port enable
P0.5 = A16 (can be shared with Keypad only)
P0.6 = A17 (can be shared with Keypad only)
Using the VE-IC
Creating applications using the VE-IC requires the development of electronic circuitry, software code, and
speech/music data files.
Software code for the VE-IC can be developed using the VE-C Language (a subset of ANSI-C). The Voice
Extreme™ IDE offers a friendly environment for developing Voice Extreme™ applications. For more information about the
Voice Extreme™ development tools, please contact Sensory or visit the web site www.voice-extreme.com.
The following sample circuit provides an example of how the VE-IC might be used in a consumer electronic
product.
Sample Application Circuit (Die)
Microphone Sensitivity
RX determines the microphone sensitivity. By default, the microphone gain is preset to a level suitable for
arm's-length user interfaces, with a 2.2K resistor at RX and a 4.7nF capacitor at CX. If a different microphone gain is
desired, select the values of RX and CX (listed as R4 and C28) from the table below:
R4      C28     Microphone note
1K      10nF    Close range or headset
1.5K    6.8nF
2.2K    4.7nF
2.7K    3.3nF   Arm's length
3.9K    2.7nF
4.7K    2.2nF   Distance
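
The sketch below encodes the table rows and computes the R x C product for each pair. It is an observation drawn from the listed values, not a statement from this data sheet: the product stays roughly constant (about 9 to 10.5 microseconds), which suggests the capacitor is scaled down as the resistor is scaled up. The struct and function names are illustrative.

```c
#include <stddef.h>

/* Rows of the microphone sensitivity table. */
struct mic_option {
    unsigned ohms;        /* resistor value in ohms    */
    unsigned picofarads;  /* capacitor value in pF     */
};

static const struct mic_option mic_options[] = {
    {1000, 10000}, {1500, 6800}, {2200, 4700},
    {2700, 3300},  {3900, 2700}, {4700, 2200},
};

static const size_t mic_option_count =
    sizeof mic_options / sizeof mic_options[0];

/* R*C product in nanoseconds: ohms * pF is in units of 1e-12 s,
 * so dividing by 1000 gives nanoseconds. */
static unsigned long rc_nanoseconds(const struct mic_option *m)
{
    return (unsigned long)m->ohms * m->picofarads / 1000ul;
}
```

For the default pair (2.2K, 4.7nF), `rc_nanoseconds(&mic_options[2])` returns 10340, that is, about 10.3 microseconds.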
2Mbit Flash Memory
This memory is required by the VE-IC in all VE applications. Managed by the dynamic memory handler of the
VE system software, the Flash stores the application code, speaker-independent weight
sets, speech templates, record and playback data, program data, and music data.
The following 2 Mbit Flash devices are supported (for further information, refer to the manufacturer's documentation):
Manufacturer    +5 Vdd      +3 Vdd
SST             29EE020     29LE020, 29LV020
WINBOND         29C020      29V020
4X5 Matrix Keypad Support
The VE-IC supports a 4x5 keypad that can be controlled using functions built into the VE-C language.
When the keypad is scanned, the columns are driven (active low), the rows are sensed (pulled high) and all
previous configuration and output values for these pins are saved and restored.
The keypad I/O pinouts are as follows:
Pin     P0.5    P1.5    P0.6    P1.6    P0.2
P0.3    1       2       3       A       E
P1.3    4       5       6       B       F
P0.4    7       8       9       C       G
P1.4    *       0       #       D       H
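
The scanning principle described above (drive one column low, read the pulled-up rows) can be sketched in plain C as follows. This is not the built-in VE-C keypad support, whose function names are not given in this data sheet. The `read_rows_fn` hook and the simulated keypad are hypothetical stand-ins for real pin access, and the save and restore of pin configuration is not modeled.

```c
#include <stdint.h>

#define KEY_ROWS 4
#define KEY_COLS 5
#define NO_KEY   '\0'

/* Key legends from the keypad table: rows P0.3, P1.3, P0.4, P1.4;
 * columns P0.5, P1.5, P0.6, P1.6, P0.2. */
static const char keymap[KEY_ROWS][KEY_COLS] = {
    {'1', '2', '3', 'A', 'E'},
    {'4', '5', '6', 'B', 'F'},
    {'7', '8', '9', 'C', 'G'},
    {'*', '0', '#', 'D', 'H'},
};

/* Hypothetical hardware hook: drive column `col` low and return the
 * four row inputs as bits 0..3. Rows are pulled high, so a pressed
 * key reads as 0 in its row bit. */
typedef uint8_t (*read_rows_fn)(int col);

/* Scan the columns in order; return the first pressed key or NO_KEY. */
static char keypad_scan(read_rows_fn read_rows)
{
    for (int col = 0; col < KEY_COLS; col++) {
        uint8_t rows = read_rows(col);
        for (int row = 0; row < KEY_ROWS; row++) {
            if ((rows & (1u << row)) == 0) {
                return keymap[row][col];
            }
        }
    }
    return NO_KEY;
}

/* Host-side simulation: one key held at (sim_row, sim_col),
 * or no key at all when sim_row is negative. */
static int sim_row = -1;
static int sim_col = -1;

static uint8_t sim_read_rows(int col)
{
    uint8_t rows = 0x0F;                      /* all rows pulled high */
    if (sim_row >= 0 && col == sim_col) {
        rows &= (uint8_t)~(1u << sim_row);    /* pressed key pulls its row low */
    }
    return rows;
}
```

Setting `sim_row = 2` and `sim_col = 3` and calling `keypad_scan(sim_read_rows)` returns 'C', the key at row P0.4 and column P1.6 in the table.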
General Purpose I/O
The VE-IC has 14 general-purpose I/O pins. Each line can be programmed as an input with a weak pull-up
resistor (~150k ohm), input with a strong pull-up resistor (~10k ohm), input without pull-ups, or as an output.

Note:
If an application is stand-alone (once you download the program via asynchronous serial I/O), the two
serial I/O pins, P0.0 and P0.1, and the serial port enable, P1.7, may be used for other purposes.

I/O pins P0.5 and P0.6 are allocated as Flash address lines (A16 and A17). They can be shared
only with the matrix keypad and must not be used for any other purpose.
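
The pin rules in this section can be captured in a small helper, sketched below under stated assumptions. The enum and function names are illustrative and are not VE-C identifiers. The check treats the two ports as P0.0-P0.7 and P1.0-P1.7, as labeled in the architecture diagram. Excluding P0.5 and P0.6 leaves 14 pins, which is consistent with the pin count given above.

```c
#include <stdbool.h>

/* The four line configurations listed above (illustrative names). */
enum pin_mode {
    PIN_INPUT_WEAK_PULLUP,    /* ~150k ohm pull-up */
    PIN_INPUT_STRONG_PULLUP,  /* ~10k ohm pull-up  */
    PIN_INPUT_NO_PULLUP,
    PIN_OUTPUT
};

/* True if pin P<port>.<bit> is free for general use by the application.
 * `standalone` means the program is already downloaded, so the serial
 * pins are no longer needed for the RS-232 link. */
static bool pin_is_free(int port, int bit, bool standalone)
{
    if (port < 0 || port > 1 || bit < 0 || bit > 7) {
        return false;                /* no such pin */
    }
    if (port == 0 && (bit == 5 || bit == 6)) {
        return false;                /* A16/A17: keypad use only */
    }
    if ((port == 0 && (bit == 0 || bit == 1)) || (port == 1 && bit == 7)) {
        return standalone;           /* RS-232 RCV, XMT, serial enable */
    }
    return true;
}
```

Counting the pins for which `pin_is_free` returns true gives 14 in a stand-alone application and 11 while the serial link is in use.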
Power
The typical operating current is 10 mA at 14.32 MHz and 3V. Lowering the clock frequency reduces power
consumption, although speech recognition requires a 14.32 MHz clock. Standby current is <5 µA in power down
mode.
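
For battery budgeting, the two figures above can be combined into an average current for a product that spends most of its time powered down. The sketch below is a simple duty-cycle estimate, not a Sensory formula: it takes the typical 10 mA operating figure, uses 5 microamps as an upper bound for standby, and ignores the rest of the circuit (microphone bias, Flash, audio amplifier). The function name is illustrative.

```c
#include <stdint.h>

#define ACTIVE_UA   10000ul  /* ~10 mA typical at 14.32 MHz, 3 V */
#define STANDBY_UA  5ul      /* upper bound: standby is below 5 uA */

/* Average current in microamps when the chip is active for
 * `active_permille` thousandths of the time and powered down otherwise.
 * Values above 1000 are clamped to 1000 (always active). */
static uint32_t average_current_ua(uint32_t active_permille)
{
    if (active_permille > 1000u) {
        active_permille = 1000u;
    }
    return (uint32_t)((ACTIVE_UA * active_permille +
                       STANDBY_UA * (1000ul - active_permille)) / 1000ul);
}
```

A chip that is awake 1% of the time (`average_current_ua(10)`) averages about 104 microamps under these assumptions.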
Oscillators
Two independent oscillators in the VE-IC provide a high-frequency clock and a 32kHz time-keeping clock. Both
oscillators work with an external crystal, a ceramic resonator or LC.
The oscillator characteristics are:
            Oscillator #1    Oscillator #2
Pins        XI1 and XO1      XI2 and XO2
Frequency   14.32 MHz        32768 Hz
Notes                        Available only with DIE package
Clock
The VE-IC uses a fully static core: the processor can be stopped (by removing the clock source) and restarted
without causing a reset or losing the contents of internal registers. Static operation is guaranteed from DC to 14.32
MHz.
Preamplifier
The on-chip preamplifier circuit consists of three stages with a maximum overall gain of about 500. The amplifier
includes a Vref input that is used to set the amplifier center voltages and must be driven by a low impedance
voltage supplied by an external source. The signal inputs of all stages have an 80 K ohm
input impedance to the
Vref pad. In a typical design, AOFE1 would be directly coupled to AIFE2, and AOFE2 would be capacitively
coupled to AIN0 through an RC lowpass filter to remove DC offset and digital noise. AOFE3 would be bypassed
to Vref with a small (220pF) capacitor for additional noise suppression.
Analog output
The VE-IC offers two separate options for analog output. The DAC (Digital to Analog Converter) output provides
a general-purpose 10-bit analog output that may be used for speech output (with the inclusion of an audio
amplifier), or other purposes requiring an analog waveform.
For speech applications that require driving a small speaker, the PWM (Pulse-Width Modulator) output can be
used instead of the DAC output. The PWM output can directly drive a 32 ohm speaker.
Serial (RS-232) Communication
Serial communication at the application level in Voice Extreme™ is always performed at 9600 baud, 8 data
bits, no parity, 1 stop bit (the download operation runs at 115 Kbaud).
If an application is stand-alone (once you download the program via asynchronous serial I/O), the two serial I/O
pins, P0.0 and P0.1, and the serial port enable, P1.7, may be used for other purposes.
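
The serial settings above translate into throughput as sketched below. With 8 data bits, no parity, and 1 stop bit, each byte occupies 10 bit times once the start bit is counted. The sketch assumes that "115 Kbaud" means the standard 115200 baud rate and ignores any protocol overhead in the download operation. The function names are illustrative.

```c
#include <stdint.h>

/* 8 data bits + 1 stop bit + the start bit of asynchronous framing. */
#define BITS_PER_FRAME 10ul

/* Payload bytes per second at a given baud rate with 8N1 framing. */
static uint32_t bytes_per_second(uint32_t baud)
{
    return (uint32_t)(baud / BITS_PER_FRAME);
}

/* Whole milliseconds needed to move `bytes` bytes at `baud`,
 * rounded up. `baud` must be nonzero. */
static uint32_t transfer_ms(uint32_t bytes, uint32_t baud)
{
    uint64_t bit_ms = (uint64_t)bytes * BITS_PER_FRAME * 1000u;
    return (uint32_t)((bit_ms + baud - 1u) / baud);
}
```

At 9600 baud an application moves at most 960 bytes per second. As an upper bound, sending a full 2 Mbit image (262144 bytes) at 115200 baud takes `transfer_ms(262144, 115200)`, which is 22756 ms, or just under 23 seconds.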