Gboard Physical Handwriting Version
===================================

Gboard Physical Handwriting Version is a device which translates your
scribble on your keyboard into a character. You can make your own Gboard
Physical Handwriting Version by printing your own printed circuit board
(PCB). Also, you can train your own model to recognize a customized set
of characters. This repository provides the circuit diagram, the board
layout and the software to recognize your stroke over the keyboard as a
character.

Software Usage
--------------

Input Characters
~~~~~~~~~~~~~~~~

.. code:: shell

   $ pip install .
   $ nazoru-input

By running the commands above, you can make your own machine into an
input device which accepts scribbles on the connected keyboard and sends
characters via Bluetooth. At the beginning, the script scans the
connected keyboards and starts listening to inputs from one of them.
Then it translates a sequence of keydowns into a predicted character,
taking the press timings into account, and sends the character to the
target device paired over Bluetooth.
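
For a rough idea of the keyboard-scanning step, here is a minimal
sketch using the ``evdev`` package (an illustration only, not the
actual ``nazoru-input`` implementation; the device-picking heuristic
and the timing format are assumptions):

.. code:: python

   # Sketch: record (key, event type, time in ms) tuples from a keyboard
   # on Linux, assuming the python-evdev package is installed.
   import time
   import evdev

   # Pick the first input device that looks like a keyboard (heuristic).
   devices = [evdev.InputDevice(path) for path in evdev.list_devices()]
   keyboard = next(d for d in devices if 'keyboard' in d.name.lower())

   events = []
   start = time.time()
   for event in keyboard.read_loop():
       if event.type != evdev.ecodes.EV_KEY:
           continue
       if event.value not in (0, 1):  # skip key-repeat events
           continue
       key = evdev.ecodes.KEY[event.code]
       state = 'down' if event.value == 1 else 'up'
       events.append((key, state, int((time.time() - start) * 1000)))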

If you want to try it for development, you can use the ``-e`` option.

.. code:: shell

   $ pip install -e .
   $ nazoru-input

Training Model
~~~~~~~~~~~~~~

.. code:: shell

   $ pip install .
   $ nazoru-training ./data/strokes.zip

We have a script to generate a trained model which recognizes input
characters from scribbles. This script renders the input stroke data
into images to extract useful features for prediction, considering the
position of each key and the timing of keyboard events. The rendered
images are fed into the neural network model and the optimizer tunes
the model to fit the data. Once the training is done, the script
outputs the trained graph, which you can use on your own device. If you
installed ``nazoru-training`` from pip, you can find ``strokes.zip``
here:
https://github.com/google/mozc-devices/mozc-nazoru/data/strokes.zip

You can change some configurations by passing command line flags (e.g.
paths to the input/output files, hyper-parameters). Run
``nazoru-training --help`` for details.

Hardware Setup
--------------

Printed Circuit Board
~~~~~~~~~~~~~~~~~~~~~

Gboard Physical Handwriting Version uses a Raspberry Pi Zero for the
keyboard input recognition and an RN42 module for the Bluetooth
connection to your laptop. You can check the wiring at
``board/schematic.png``. Also, the original CAD data in EAGLE format is
available (``board/nazoru-stack.sch`` and ``board/nazoru-stack.brd``).
The board has non-connected pads and connectors for SPI and I2C. The
connector itself should be compatible with other Raspberry Pi models,
but we tested it only on the Raspberry Pi Zero W.

.. image:: ./board/schematic.png
   :width: 1000px

Raspberry Pi Setup
~~~~~~~~~~~~~~~~~~

**Step 0 - Prepare your Raspberry Pi**

Please prepare your Raspberry Pi, an SD card initialized with the
RASPBIAN image, and an RN42 module. Connect your Raspberry Pi with the
RN42 as the schematic shows. Please make sure it has access to the
internet and enough disk space to install the packages in the following
steps.

**Step 1 - Setup UART to RN42**

If you try it on a Raspberry Pi Zero W or Raspberry Pi 3, you need
additional settings for the serial communication because these models
have a wireless module connected over the UART. See the details in the
`official documentation
<https://www.raspberrypi.org/documentation/configuration/uart.md>`_.
In short, you need to add ``enable_uart=1`` to ``/boot/config.txt`` on
your Raspberry Pi.

**Step 2 - Initial setup for RN42**

You need to write your initial setup to the RN42. First, install
``screen`` and open ``/dev/serial0`` for configuration.

.. code:: shell

   $ sudo apt install screen
   $ sudo screen /dev/serial0 115200

After that, please type the following commands. Note that you need to
press ENTER after each command. For example, please type ``$$$`` and
ENTER to execute the ``$$$`` command.

1. ``$$$`` : Get into the command mode. The green LED will blink
   faster.
2. ``+`` : Echo the input so that you can see what you type.
3. ``SD,0540`` : Set the device class to keyboard.
4. ``S~,6`` : Set the profile to HID.
5. ``SH,0200`` : Set the HID flag to keyboard.
6. ``SN,nazoru-input`` : Set the device name to nazoru-input. You
   can name it as you want.
7. ``R,1`` : Reboot the RN42.

You can quit ``screen`` with ``C-a k``.
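
If you prefer to script this configuration instead of typing it into
``screen``, a minimal sketch with the ``pyserial`` package could look
like the following (the command set is the list above; the delays are
rough guesses):

.. code:: python

   # Sketch: send the RN42 setup commands from Step 2 over the serial
   # port, assuming the pyserial package is installed.
   import time
   import serial

   commands = [b'$$$', b'+', b'SD,0540', b'S~,6', b'SH,0200',
               b'SN,nazoru-input', b'R,1']

   with serial.Serial('/dev/serial0', 115200, timeout=1) as ser:
       for command in commands:
           ser.write(command + b'\r')  # ENTER after each command
           time.sleep(0.5)             # give the module time to answer
           print(ser.read(ser.in_waiting or 1))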

**Step 3 - Download and install nazoru-input**

We provide a service file at ``data/nazoru.service`` to launch
``nazoru-input`` on boot. You can install it by uncommenting the
``data_files`` entry in ``setup.py``. Also, before installing this
package, we strongly recommend installing the following packages from
the apt repository, so that you can use pre-built packages.

.. code:: shell

   $ sudo apt install git python-pip python-numpy python-cairocffi \
       python-h5py python-imaging python-scipy libblas-dev liblapack-dev \
       python-dev libatlas-base-dev gfortran python-setuptools \
       python-html5lib
   $ sudo pip install http://ci.tensorflow.org/view/Nightly/job/nightly-pi-zero/219/artifact/output-artifacts/tensorflow-1.6.0-cp27-none-any.whl
   $ git clone https://github.com/google/mozc-devices
   $ cd mozc-devices/mozc-nazoru
   $ sudo pip install .  # If you want to develop nazoru-input, please use 'pip install -e .' instead.

**Step 4 - Enjoy!**

.. code:: shell

   $ sudo nazoru-input  # If you omit sudo, nazoru-input may use a DummyBluetooth object.

Training Data Format
--------------------

We provide the raw training data at ``data/strokes.zip``. Once you
uncompress the zip file, you will get a ``.ndjson`` file which contains
all the entries (we call them **strokes**) we used for training.

Each stroke entry contains the following fields:

+----------+-----------+-------------------------------------------+
| Key      | Type      | Description                               |
+==========+===========+===========================================+
| id       | integer   | A unique identifier across all strokes.   |
+----------+-----------+-------------------------------------------+
| writer   | string    | A unique identifier of the writer.        |
+----------+-----------+-------------------------------------------+
| kana     | string    | Label of the character drawn.             |
+----------+-----------+-------------------------------------------+
| events   | list      | List of keyboard events.                  |
+----------+-----------+-------------------------------------------+

Each event is a 3-tuple of (``key``, ``event type``, ``time``). ``key``
describes the key on which the event happened. ``event type`` describes
what type of event happened; it is either "down" (keydown) or "up"
(keyup). ``time`` describes the elapsed time in milliseconds until the
event is fired.

For example, the entry below denotes a stroke of "ほ
(\\u307b)" accompanied by a sequence of keyboard events
starting from the keydown event on "t" and ending at the keyup event on
"l", which was fired 1.005 seconds after recording started.

.. code:: json

   {
     "id": 5788999721418752,
     "writer": "ffb0dac6b8be3faa81da320e29a2ba72",
     "kana": "\u307b",
     "events": [
       ["t", "down", 0],
       ["g", "down", 40],
       ...
       ["l", "down", 966],
       ["l", "up", 1005]
     ]
   }
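
As a rough illustration, the uncompressed strokes file can be loaded
line by line (a minimal sketch; the file name ``strokes.ndjson`` is an
assumption):

.. code:: python

   # Sketch: load stroke entries from the uncompressed .ndjson file,
   # assuming one JSON object per line with the fields described above.
   import json

   with open('strokes.ndjson') as f:
       strokes = [json.loads(line) for line in f]

   first = strokes[0]
   print(first['kana'], len(first['events']))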

You can also prepare your own dataset in ``.ndjson`` format and rebuild
the model on it. The list of kanas to recognize is in
``src/nazoru/lib.py``. You can update it if you want to modify the set
of characters.

Network Structure
-----------------

Data Preprocessing
~~~~~~~~~~~~~~~~~~

Each stroke entry is rendered to a square image before any training
runs. The script (``nazoru-training``) renders strokes in various ways
to extract useful features. Our default settings extract 10 features
from each stroke entry: 8 directional features and 2 temporal features
on a 16x16 square canvas; this means that the input shape is 16x16x10
by default.
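
In other words, each training example becomes a 16x16x10 tensor with
one channel per feature (a sketch of the shapes only; the rendering
itself is omitted):

.. code:: python

   # Sketch: the default input shape described above, as a NumPy array.
   import numpy as np

   canvas_size = 16      # 16x16 square canvas
   num_features = 8 + 2  # 8 directional + 2 temporal features

   # One rendered stroke entry: height x width x channels.
   example = np.zeros((canvas_size, canvas_size, num_features))
   print(example.shape)  # (16, 16, 10)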

Convolutional Network
~~~~~~~~~~~~~~~~~~~~~

Rendered inputs are fed into a convolutional neural network designed
for this task. The body structure looks like this (a sketch follows the
list):

- Convolutional layer (kernel size: 3x3, filter size: 32, stride: 2,
  activation: ReLU)
- Separable convolutional layer (kernel size: 3x3, filter size: 64,
  stride: 1, activation: ReLU)
- Separable convolutional layer (kernel size: 3x3, filter size: 128,
  stride: 2, activation: ReLU)
- Separable convolutional layer (kernel size: 3x3, filter size: 128,
  stride: 1, activation: ReLU)
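
A minimal sketch of this body in Keras (an illustration only; the
padding choice and the omitted classification head are assumptions, and
``nazoru-training`` does not necessarily use Keras):

.. code:: python

   # Sketch: the convolutional body described above, in Keras.
   import tensorflow as tf

   model = tf.keras.Sequential([
       tf.keras.Input(shape=(16, 16, 10)),  # 16x16 canvas, 10 features
       tf.keras.layers.Conv2D(32, 3, strides=2, padding='same',
                              activation='relu'),
       tf.keras.layers.SeparableConv2D(64, 3, strides=1, padding='same',
                                       activation='relu'),
       tf.keras.layers.SeparableConv2D(128, 3, strides=2, padding='same',
                                       activation='relu'),
       tf.keras.layers.SeparableConv2D(128, 3, strides=1, padding='same',
                                       activation='relu'),
   ])
   model.summary()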

For more details about the separable convolutional layers, please refer
to the `MobileNet <https://arxiv.org/abs/1704.04861>`__ architecture.

Authors
-------

Machine Learning:

Shuhei Iitsuka <tushuhei@google.com>

Hardware, system setup:

Makoto Shimazu <shimazu@google.com>

License
-------

Licensed under the Apache License, Version 2.0 (the "License"); you may
not use this file except in compliance with the License. You may obtain
a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.