An ASR Bundle's on-disk size is typically between 30MB and 60MB, depending on which acoustic model is used. When compressed, the size is approximately 20MB to 45MB, which is how much your app's download size would increase if the ASR Bundle were embedded in it.

If size is of concern, you can have the app download the ASR Bundle when the user launches it for the first time instead of including the ASR Bundle in the app itself.

Keen Research currently provides ASR Bundles for English and Spanish. We can provide ASR Bundles upon request for most major spoken languages within 6-8 weeks.
We provide an English ASR Bundle optimized for children's voices; its performance on comparable test sets is equivalent to that of the adult models. A number of our customers use the KeenASR SDK in apps for children.
Make sure that the decoding graph is recreated after you update the list of phrases. The SDK allows you to check whether the decoding graph exists in the file system, so that you don't have to create it every time the app is started. However, when you change the input to the decoding graph creation methods (for example, by augmenting or modifying the list of phrases during development), you need to make sure that the decoding graph is recreated using the updated input data. In other words, make sure you are not skipping the call to create the decoding graph simply because it has been created in the past.
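One simple way to guarantee this is to fingerprint the phrase list and rebuild the decoding graph whenever the fingerprint changes. The sketch below is platform-neutral Python that illustrates the bookkeeping only; the actual existence check and graph creation are done through KeenASR SDK methods, and the fingerprint file name here is an assumption for illustration.

```python
import hashlib
import os

def phrases_fingerprint(phrases):
    """Stable hash of the phrase list used to build the decoding graph."""
    joined = "\n".join(phrases)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def needs_rebuild(phrases, fingerprint_path):
    """Return True if the decoding graph should be (re)created."""
    if not os.path.exists(fingerprint_path):
        return True  # never built, or the fingerprint record was lost
    with open(fingerprint_path) as f:
        return f.read().strip() != phrases_fingerprint(phrases)

def record_build(phrases, fingerprint_path):
    """Call after successfully creating the decoding graph."""
    with open(fingerprint_path, "w") as f:
        f.write(phrases_fingerprint(phrases))
```

In an app you would call the equivalent of `needs_rebuild` alongside the SDK's graph-existence check, create the graph when it returns true, and then record the fingerprint.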

For the most part, no, you cannot (or should not) control the audio stack. The KeenASR SDK assumes control of the audio stack on most platforms. This way we can guarantee that the audio is captured in exactly the way the KeenASR SDK expects it. We typically initiate the audio stack for the most general use and then provide callbacks that allow you to set up other SDKs or audio modules in a way that will not interfere with our SDK. For more information, see the Audio Handling section on the Getting Started page.

If you have a specific use case that does not match current SDK capabilities, drop us a line.

Keen Research recommends that you log complete recognition results in both partial and final callbacks. It is useful to display this information on the screen during debugging and early user testing. Even if you don't plan to show it in the final product, it will help you catch errors early on.
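The logic can be as simple as a small collector object wired into the recognizer's callbacks. The sketch below is illustrative Python with hypothetical method names (`on_partial`, `on_final`); the real callback names depend on the platform and SDK version.

```python
import time

class RecognitionLogger:
    """Collects partial and final recognition results for debugging.

    Hook these methods up to your recognizer's partial/final result
    callbacks; entries can later be dumped to a log file or screen.
    """
    def __init__(self):
        self.entries = []

    def on_partial(self, text):
        self.entries.append(("partial", time.time(), text))
        print(f"[partial] {text}")

    def on_final(self, text):
        self.entries.append(("final", time.time(), text))
        print(f"[final] {text}")
```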

Review all console/logcat outputs as well as log messages with warning or error levels.

Use visual and audio indicators for when the app is actively listening; this is a good general UX practice, not just during debugging.

If you are conducting user testing, you can connect your app to the Dashboard and automatically send audio data and recognition results to the cloud for further analysis.

For assessing speech recognition performance, the best approach is to collect a small amount of test data. You can run the SDK against this test data set in a controlled manner to evaluate how well the recognition works.
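The standard metric for such an evaluation is word error rate (WER): the number of substitutions, insertions, and deletions between the reference transcript and the recognizer's output, divided by the number of reference words. A minimal sketch (general-purpose; not part of the KeenASR SDK):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, `word_error_rate("turn on the lights", "turn off the light")` yields 0.5 (two substitutions over four reference words). Averaging WER over even a few dozen recorded utterances gives a much more reliable picture than ad-hoc testing.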

The answer depends on several factors: (1) the size of your language model, (2) the amount of available CPU and RAM on your device, and (3) how often you need to create decoding graphs. If you are dealing with more than 10,000 words, or if the content you are using to create the decoding graph is large (> 25,000 phrases), you will probably want to work with us and create decoding graphs ahead of time in your development sandbox.

Note that decoding graphs created on the device do not use the rescoring approach in decoding. Also note that this answer relates to the process of creating decoding graphs, which is typically done once.

Consider this answer as a baseline; ultimately, you will need to validate your approach on devices that will be used in a production setting.

Memory footprint is driven by the size of the deep neural network acoustic model and by the size of the decoding graph (primarily by the language model that was used to build the decoding graph).

CPU utilization will also depend on the size of the model (there is a fixed CPU processing burden related to the size of the acoustic model; for each frame of the audio, audio features are pushed through the deep neural network). The other factor affecting CPU utilization is graph search, which depends on the size of the graph (size of the language model) as well as on various configuration parameters.

For medium vocabulary tasks (such as searching a movie library with ~7000 titles), the memory footprint for an SDK with the standard ASR Bundle would be 100-150MB. CPU utilization would be approximately 40% of a single core on the iPhone 6s.

Keen Research is working on a number of optimizations for mobile devices that will significantly reduce memory footprint and CPU utilization.
Yes, as of version 1.5 you can disable notification handling in the SDK by setting KIOSRecognizer's handleNotifications property to NO. You will still need to make sure audio interrupts are properly handled and the audio stack is deactivated and activated as necessary (for example, when a phone call comes through) by using the activateAudioStack and deactivateAudioStack KIOSRecognizer methods.
For mobile apps, licensing is typically structured as a yearly recurring fee per product you want to voice-enable. Depending on the type of your product or app, this structure may vary and can include revenue sharing, per-device installation fees, or a one-time fee.

Keen Research's pricing and licensing structure depends on a number of factors; we don't have a single pricing model for SDK licensing. To start the conversation, let us know what you are building and how you plan to use our SDK.

A KeenASR license includes SDKs for iOS and Android, access to ASR Bundles trained in-house (including support for additional languages), access to new SDK releases, and the use of the Dashboard service during the app development phase. There are no limits on the number of installations, number of users, usage, etc.
Keen Research does not provide software development services, so most likely the answer is no.

Through our Professional Services Agreement, we can assist with proofs of concept (typically rudimentary apps with a focus on the voice user interface), evaluation of a specific task domain, integration of the SDK in your app, training of custom acoustic and language models, and porting to custom hardware platforms.
Please check this page for potential problems with line endings in text files that can occur with GitHub on Windows.
Make sure you are building the app for a real device, since the KeenASR SDK will not run on the simulator.

Another reason may be that you cloned the iOS PoC project from GitHub without having git-lfs installed on your development machine. Due to the large size of the KeenASR framework, some of the files in the iOS KeenASR framework are managed via git-lfs. If git-lfs is not installed when you clone the repository, you will end up with a git-lfs reference file (a small text file) instead of the binary file. You can check the size of the library file (KeenASR.framework/KeenASR); if properly checked out, it will be around 100MB; it should not be very small (a few thousand bytes).
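A git-lfs reference file is a tiny text file whose first line is `version https://git-lfs.github.com/spec/v1`, so it is easy to detect programmatically. A small illustrative check (not part of the SDK; the 512-byte threshold is an assumption that comfortably covers pointer files):

```python
import os

def is_lfs_pointer(path, max_pointer_size=512):
    """Return True if `path` looks like a git-lfs pointer file
    rather than the real binary it should have been replaced by."""
    if os.path.getsize(path) > max_pointer_size:
        return False  # real binaries are far larger than any pointer
    with open(path, "rb") as f:
        first_line = f.readline()
    return first_line.startswith(b"version https://git-lfs.github.com/spec/v1")
```

Running this against `KeenASR.framework/KeenASR` (or simply inspecting the file's size and first line by hand) tells you immediately whether the checkout went wrong.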

The library in the iOS PoC app will also not be properly checked out if you simply download the project zip file from GitHub.