Jekyll2022-02-11T20:16:33+00:00https://rianadon.github.io/blog/feed.xmlPaper StacksA blog by rianadonrianadonCompiling KDE connect on an M1 Mac2022-02-11T00:00:00+00:002022-02-11T00:00:00+00:00https://rianadon.github.io/blog/2022/02/11/compiling-kde-connect-on-an-m1-mac<p><a href="https://kdeconnect.kde.org/">KDE Connect</a> is a beautiful app that syncs non-Apple devices to your Mac. The only problem is that building it on a Mac is a little … intense. Especially on arm64 / Apple Silicon architecture.</p>
<p>So I thought I’d write up some instructions in case anyone else runs into the same issues.</p>
<p>It happens that the build infrastructure, which uses the <a href="https://community.kde.org/Craft">Craft</a> build system, is set up for x86_64 architecture. When it downloads packages from the internet to speed up the build, it downloads the x86 versions. And when there is no download and your Mac compiles programs, it (sometimes) does so in the arm64 architecture, and these two cannot be linked!</p>
<p>In fact, you might get an error like this when Craft compiles dbus:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>building for macOS-arm64 but attempting to link with file built for macOS-x86_64
</code></pre></div></div>
<h1 id="the-fix">The fix</h1>
<p>An arm64 build would be nice; however, I could not figure out how to make the Qt libraries compile for arm64. So the solution is to force a build for x86_64.</p>
<p>Specify <code class="language-plaintext highlighter-rouge">CFLAGS</code> and all variations to add the <code class="language-plaintext highlighter-rouge">-arch x86_64</code> flag so any <code class="language-plaintext highlighter-rouge">clang</code> build command builds for x86:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CFLAGS</span><span class="o">=</span><span class="s2">"-arch x86_64"</span> <span class="nv">CXXFLAGS</span><span class="o">=</span><span class="s2">"-arch x86_64"</span> <span class="nv">OBJCFLAGS</span><span class="o">=</span><span class="s2">"-arch x86_64"</span> craft kde/applications/kdeconnect-kde
</code></pre></div></div>
<p>Make sure, of course, that you have installed Craft and sourced your environment file (you can follow the instructions on the <a href="https://community.kde.org/KDEConnect/Build_MacOS#Setting_up_Craft_environment">Building KDE Connect on MacOS page</a>).</p>
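<p>If you expect to re-run Craft several times, a small shell function saves retyping the flags. This is only a convenience sketch: <code class="language-plaintext highlighter-rouge">with_x86_flags</code> is a name made up for this example, not something Craft provides.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: run any command with the x86_64 flags set for that one invocation
with_x86_flags() {
  CFLAGS="-arch x86_64" CXXFLAGS="-arch x86_64" OBJCFLAGS="-arch x86_64" "$@"
}

# Usage (inside a sourced Craft environment):
# with_x86_flags craft kde/applications/kdeconnect-kde
</code></pre></div></div>
<p>Because the variables are set only for the wrapped command, the rest of your shell session keeps its normal arm64 defaults.</p>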
<p>If you are using Homebrew, you might have conflicts with Craft trying to use some of your arm64 libraries, since both the Craft libraries and Homebrew libraries are discovered through pkg-config. I had to rename gcrypt and libepoxy temporarily while the build was running, since these are optional for some of the KDE Connect dependencies:</p>
<ul>
<li>Rename <code class="language-plaintext highlighter-rouge">/opt/homebrew/opt/libgcrypt</code> to <code class="language-plaintext highlighter-rouge">/opt/homebrew/opt/libgcrypt-rename</code> (optional dep for qca)</li>
<li>Rename <code class="language-plaintext highlighter-rouge">/opt/homebrew/Cellar/libepoxy/</code> to <code class="language-plaintext highlighter-rouge">/opt/homebrew/Cellar/libepoxy-rename</code> (optional dep for kdeclarative)</li>
</ul>
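<p>The renames above are easy to forget to undo, so here is a minimal sketch of a pair of helpers that hide a directory and put it back. The function names are invented for this example, and the paths in the usage comments assume a default Apple Silicon Homebrew prefix.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helpers: temporarily hide a Homebrew directory from pkg-config
hide_dir() {
  dir="${1%/}"                 # drop a trailing slash so mv renames the entry itself
  if [ -d "$dir" ]; then
    mv "$dir" "$dir-rename"
  fi
}

restore_dir() {
  dir="${1%/}"
  if [ -d "$dir-rename" ]; then
    mv "$dir-rename" "$dir"
  fi
}

# Usage:
# hide_dir /opt/homebrew/opt/libgcrypt
# hide_dir /opt/homebrew/Cellar/libepoxy
# ... run the craft build ...
# restore_dir /opt/homebrew/opt/libgcrypt
# restore_dir /opt/homebrew/Cellar/libepoxy
</code></pre></div></div>
<p>Both functions do nothing if the directory is not there, so running them twice is harmless.</p>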
<p>Once you successfully compile KDE Connect, create a package with Craft then install it!</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>craft <span class="nt">--package</span> kde/applications/kdeconnect-kde
</code></pre></div></div>
<h1 id="making-changes-with-a-development-repo">Making changes with a development repo</h1>
<p>The most straightforward method is to use the <code class="language-plaintext highlighter-rouge">craft</code> command to continue building the source code. First find where <code class="language-plaintext highlighter-rouge">craft</code> downloaded the source code. This should be something like <code class="language-plaintext highlighter-rouge">~/CraftRoot/build/kde/applications/kdeconnect-kde/work/kdeconnect-kde-21.12.2</code>.</p>
<p>You are going to replace this directory with the latest source code. Delete the <code class="language-plaintext highlighter-rouge">kdeconnect-kde-21.12.2</code> folder, then clone the repository into the same location and rename the folder to <code class="language-plaintext highlighter-rouge">kdeconnect-kde-21.12.2</code> (or whatever the <code class="language-plaintext highlighter-rouge">kdeconnect-kde-version</code> folder was named before).</p>
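<p>As a rough sketch, the delete-and-clone step could look like the function below. The helper name is hypothetical, and the repository URL in the usage comment is my assumption of where the KDE Connect sources live; substitute the folder name Craft actually created on your machine.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: swap the sources Craft unpacked for a git checkout.
# $1 = Craft work directory, $2 = versioned folder name, $3 = repository URL
use_dev_checkout() {
  work_dir="${1:?work directory required}"
  folder="${2:?folder name required}"
  repo="${3:?repository URL required}"
  rm -rf "$work_dir/$folder"               # delete the unpacked release sources
  git clone "$repo" "$work_dir/$folder"    # clone straight into the old folder name
}

# Usage (URL and version are assumptions):
# use_dev_checkout ~/CraftRoot/build/kde/applications/kdeconnect-kde/work \
#   kdeconnect-kde-21.12.2 https://invent.kde.org/network/kdeconnect-kde.git
</code></pre></div></div>
<p>Cloning directly into the versioned folder name skips the separate rename step, and the <code class="language-plaintext highlighter-rouge">${1:?}</code> guards keep an empty argument from turning the <code class="language-plaintext highlighter-rouge">rm -rf</code> into something destructive.</p>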
<p>Then you can run the two commands below to compile and package KDE Connect. Drag the application into your Applications folder once the packaging finishes, and you can run it as usual!</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>craft --compile --install --qmerge kde/applications/kdeconnect-kde
craft --package kde/applications/kdeconnect-kde
</code></pre></div></div>
<h2 id="making-speedier-changes">Making speedier changes</h2>
<blockquote>
<p>⚠️ Some functions that require shared libraries (like loading SMS messages) do not work with this method.</p>
</blockquote>
<p>Craft takes a while to package changes, which is not ideal, and I also don’t like working in a folder 7 levels down from my home directory. So below is my super-hacky development setup:</p>
<ol>
<li>Clone the KDE Connect repo anywhere.</li>
<li>Source your craft environment: <code class="language-plaintext highlighter-rouge">source ~/CraftRoot/craft/craftenv.sh</code></li>
<li><code class="language-plaintext highlighter-rouge">cd</code> to where you cloned your repo. Sourcing your craft env changes your current directory!</li>
<li>Make a build directory: <code class="language-plaintext highlighter-rouge">mkdir build</code></li>
<li>Navigate to your build directory: <code class="language-plaintext highlighter-rouge">cd build</code></li>
<li>Set up Makefiles: <code class="language-plaintext highlighter-rouge">cmake ..</code></li>
<li>Compile and install KDE Connect: <code class="language-plaintext highlighter-rouge">make install</code></li>
</ol>
<p>At this point, you will have a KDE Connect app installed to your <code class="language-plaintext highlighter-rouge">/Applications/KDE</code> folder (this is different than where <code class="language-plaintext highlighter-rouge">craft</code> installs)! However, you won’t be able to run it because this application folder is missing the necessary frameworks and resources.</p>
<p>To fix this, copy all the files from the Craft installation that do not exist in your new installation:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cp</span> <span class="nt">-n</span> <span class="nt">-r</span> /Applications/kdeconnect-indicator.app/Contents/ /Applications/KDE/kdeconnect-indicator.app/Contents/
</code></pre></div></div>
<p>Now launch a new terminal tab, source your craft environment, and run:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/Applications/KDE/kdeconnect-indicator.app/Contents/MacOS/kdeconnect-indicator
</code></pre></div></div>
<p>You’ll need to be inside your craft environment to launch the app because of how the libraries have been linked. Use this new terminal tab to launch KDE Connect and the original to re-run <code class="language-plaintext highlighter-rouge">make install</code> whenever you make a change to the source code.</p>rianadonKDE Connect is a beautiful app that syncs non-Apple devices to your Mac. The only problem is that building it on a Mac is a little … intense. Especially on arm64 / Apple Silicon architecture.Touchapad is a Touchable Button Pad2020-05-23T00:00:00+00:002020-05-23T00:00:00+00:00https://rianadon.github.io/blog/2020/05/23/touchapad<p>Smart home control is built with phones and smart hubs, which serve as our primary interface to control our smart devices. However, must we carry our smartphones or smart hubs with us to control our appliances? How can we distance ourselves from these necessary but sometimes noisy or prying devices? The Touchapad serves as an alternative solution, with a set of virtual buttons that can be placed outdoors in an electrical box or adapted to other environments.</p>
<p><img src="https://rianadon.github.io/blog/assets/2020-05-20-box1.jpg" alt="The touchapad" width="500" height="375" /></p>
<p>The buttons control devices linked to <a href="https://home-assistant.io">Home Assistant</a>. The layout of buttons can be configured from a phone, tablet, or laptop.</p>
<figure>
<div>
<img src="https://rianadon.github.io/blog/assets/2020-05-21-phone.jpg" alt="Configuration on a phone" width="302" height="227" />
<img src="https://rianadon.github.io/blog/assets/2020-05-21-website.jpg" alt="Configuration on a laptop" width="438" height="227" />
</div>
</figure>
<!-- ![img](https://rianadon.github.io/blog/assets/2020-05-21-phone.jpg) -->
<!-- ![img](https://rianadon.github.io/blog/assets/2020-05-21-website.jpg) -->
<h1 id="design">Design</h1>
<p>If the goal is to be as unobtrusive as possible, why does the Touchapad use a touchscreen to display buttons? Physical buttons have three issues:</p>
<ol>
<li>
<p><strong>Communicating Consequences</strong>: The simplest of light switches are great communicators. I want to turn on a lightbulb I cannot see. I can simply flip the switch down! If the switch is already down, I know the light is already on, and I don’t need to touch the switch. But what if I connect my light to the internet and turn it off with my phone? I can no longer trust the light switch, for now the light is off but the switch is still pressed down. If I use a toggle button, I must observe whether my light is on before I press the switch, else I won’t know what pressing the switch will do. A virtual button, however, communicates its consequences. It can display whether the light is on or off through visual effects and update its status when other switches are pressed. If I turn the light off with my phone, the virtual button will update accordingly.</p>
</li>
<li>
<p><strong>Lighting</strong>: Touchscreens have backlights. These draw extra power, but allow you to easily locate the touchscreen in the dark. Physical buttons often don’t come with built-in lighting.</p>
</li>
<li>
<p><strong>Cost</strong>: Physical buttons that satisfy the first two constraints are possible to build. A toggle button with an adjacent LED to indicate the state would suffice. However, it is difficult to find a set of buttons like this, especially at a good price. I bought my touchscreen for about $15, and it offers vastly more flexibility than physical buttons.</p>
</li>
</ol>
<p>The names and states of the buttons can be configured over the Arduino’s serial port. Whenever a button is pressed, the Arduino will send a message back through the serial port.</p>
<h1 id="construction">Construction</h1>
<p>Designing the system to fit within an electrical box imposed harsh constraints. I needed to use an Arduino board to interface with the touchscreen I bought, so I decided to use an Arduino Leonardo as it is one of the few boards that comes with an Uno-style pin layout but can be interfaced with via the thinner micro USB cable.</p>
<p><img src="https://rianadon.github.io/blog/assets/2020-05-20-pcb.jpg" alt="PCB shield" width="500" height="375" /></p>
<p>PCB shield for the Arduino. Two sets of headers are used to shift the touchscreen. The chip on top handles RS232 communication.</p>
<p>The touchscreen, when attached directly to the Arduino’s header pins, makes the assembly too wide. It needed to be shifted over several inches. To accomplish this, as well as to add optional support for RS232 communication, I built a PCB that shifts the header positions, which avoids manually soldering many wires.</p>
<figure>
<div>
<img src="https://rianadon.github.io/blog/assets/2020-05-20-plate.jpg" alt="3D-printed plate" width="370" height="278" />
<img src="https://rianadon.github.io/blog/assets/2020-05-20-part2.jpg" alt="3D-printed anchoring structure" width="370" />
</div>
<figcaption>3D printed parts to encase the assembly</figcaption>
</figure>
<p>The whole system is joined by several 3D-printed parts. A plate locks to the weatherproof enclosure and constrains the Arduino board vertically. Another part fits over the plate to fully anchor the Arduino, and it is locked in by two long bars. Finally, the PCB is attached through the header pins and screw holes and held in place by plastic nuts.</p>
<figure>
<div>
<img src="https://rianadon.github.io/blog/assets/2020-05-20-assembly1.jpg" alt="Touchscreen attached to plate" width="370" height="278" />
<img src="https://rianadon.github.io/blog/assets/2020-05-20-assembly2.jpg" alt="Arduino and PCB attached to assembly" width="370" height="278" />
</div>
<figcaption>The assembly process</figcaption>
</figure>
<h1 id="software">Software</h1>
<p><img src="https://rianadon.github.io/blog/assets/2020-05-23-touchapad.svg" alt="Software architecture" width="400" height="207" /></p>
<p>The Arduino is programmed with the standard Arduino C++. However, the Arduino does not have Ethernet or Wi-Fi capabilities. Rather than use shields to add functionality, which would increase size, I decided to separate responsibilities into two devices. The touchscreen is only responsible for managing its buttons, and talks about its state over serial. A Linux computer (I used a BeagleBone Black but just as easily could have used a Raspberry Pi or other mini-computer) handles the communication with Home Assistant and web configuration of the touchscreen using a Python program.</p>
<h1 id="its-open-source">It’s Open Source!</h1>
<p>All files for this project (Arduino code, CAD files, KiCad design files, Python server, and web configuration) can be found at the project’s <a href="https://github.com/rianadon/touchapad/">GitHub repository</a> if you’d like to build something similar yourself. Enjoy!</p>rianadonSmart home control is built with phones and smart hubs, which serve as our primary interface to control our smart devices. However, must we carry our smartphones or smart hubs with us to control our appliances? How can we distance ourselves from these necessary but sometimes noisy or prying devices? The Touchapad serves as an alternative solution, with a set of virtual buttons that can be placed outdoors in an electrical box or adapted to other environments.A Guide to H.264 Streaming with Regards to FRC2019-04-04T00:00:00+00:002019-04-04T00:00:00+00:00https://rianadon.github.io/blog/2019/04/04/guide-to-h264-streaming-frc<p>For two and a half years, I’ve been trying to reliably stream H.264 video from our robot to our driver station. Finally, this year, the stream worked reliably. I’m going to graciously detail how we did this. The post is structured as a tutorial, with some background theory worked in. I’ve tried to provide links wherever possible for further exploration.</p>
<p>So, I shall first start off with:</p>
<h2 id="table-of-contents">Table of Contents</h2>
<ol>
<li><a href="#org4458f40">What’s this H.264 thing and why do you want it?</a></li>
<li><a href="#org5821553">Technology overview</a>
<ol>
<li><a href="#org6b32f4b">GStreamer basics</a></li>
<li><a href="#orgccf2c0a">Some example GStreamer pipelines</a>
<!-- 1. [A very basic example](#orga866ad3) -->
<!-- 2. [More complex examples](#orgd59e5db) --></li>
</ol>
</li>
<li><a href="#org2c50ef0">GStreamer RTSP Server</a>
<!-- 1. [Installation](#org426c3ff) -->
<!-- 2. [My own server pipeline](#org2d52f60) -->
<!-- 3. [Some fries on the side](#org12e5995) -->
<!-- 4. [Extra fanciness](#orgc8f63da) -->
<!-- 5. [Some more on viewing](#org98ce02a) --></li>
<li><a href="#org261ba64">Systemd and services</a>
<ol>
<li><a href="#org8199371">Bonus: Saving limelight video</a></li>
</ol>
</li>
<li><a href="#orgfade6f9">OpenCV and GStreamer</a></li>
<li><a href="#org67c1856">Appendix: Some Code</a></li>
<li><a href="#org9fc34fc">A little postmortem</a></li>
</ol>
<p><a id="org4458f40"></a></p>
<h2 id="whats-this-h264-thing-and-why-do-you-want-it">What’s this H.264 thing and why do you want it?</h2>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-03-27-mjpegh264.png" alt="Visualization of MJPEG and H.264 differences" width="740" height="296" />
<figcaption>A well-illustrated diagram depicting the differences in encoding between MJPEG and H.264</figcaption>
</figure>
<p>MJPEG and H.264 are both video codecs. They define how to compress video. The de facto standard in FRC right now is <a href="http://www.theclosedcaptioningproject.com/?p=524">MJPEG</a> (transported over HTTP). And for a pretty good reason. It’s easy and fast to compress images to JPEG; you don’t need very powerful hardware. Plenty of libraries exist to compress images to JPEG, and streaming the video to somewhere else simply requires sending the images immediately one after another over HTTP. You can also display the stream in browsers, dashboards, and just about anything that’s not your toaster.</p>
<p>MJPEG compresses each frame individually, making them more <a href="http://web.mit.edu/course/21/21.guide/techterm.htm">fuzzy</a> to save bandwidth. You could also call this spatial compression. In addition to spatial compression, H.264 utilizes temporal compression, or compressing changes between each frame. Imagine your robot is bricked (I said imagine; I know this would never happen to you). In your camera feed you see Team 1072 scoring five points in the hatch scale boiler. How much changed in that image? Not much. An MJPEG stream would send everything visible in the feed again, including the beautiful gray carpet. An H.264 stream would send only the information necessary to reconstruct the 1072 bot’s movement. You may imagine this would add up to lots of data savings. It does. You can fit a somewhat good-looking 320x240 video at 15 fps into 250 kilobits per second. You could stream four cameras in about 1 Mbps.</p>
<blockquote>
<p>If your robot is executing victory spins, you’ll see less data saving as the scenery around will be constantly changing. However, there will still be compressible elements moving across your field of view, and in practice there are still substantial reductions in data.</p>
</blockquote>
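<p>To put the 250 kilobits per second figure in perspective, here is a back-of-the-envelope calculation in shell arithmetic. It assumes the uncompressed video is stored as I420 (12 bits per pixel) and ignores network packet overhead, so treat the numbers as rough estimates rather than measurements.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>width=320
height=240
fps=15
bits_per_pixel=12     # I420: 8 bits of luma plus 4 bits of subsampled chroma per pixel
h264_bps=250000       # the 250 kbps target bitrate

raw_bps=$((width * height * bits_per_pixel * fps))

echo "raw video: $((raw_bps / 1000)) kbps"                    # raw video: 13824 kbps
echo "H.264:     $((h264_bps / 1000)) kbps"                   # H.264:     250 kbps
echo "ratio:     roughly $((raw_bps / h264_bps))x smaller"    # ratio:     roughly 55x smaller
echo "4 cameras: $((4 * h264_bps / 1000)) kbps"               # 4 cameras: 1000 kbps
</code></pre></div></div>
<p>In other words, the encoder has to throw away or predict about 98% of the raw bits, which is why the temporal tricks matter so much.</p>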
<p>Since H.264 requires some more power (approximately 10 more oomphs of CPU) to do its compression, you really want to use some external coprocessor to do your video encoding should you choose to run with this.</p>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-04-09-stones.png" alt="Codecs as Infinity Stones" width="467" height="180" />
<figcaption>With all six, we'd be able to compress the whole universe in half!</figcaption>
</figure>
<p>I’d rather not go too far in depth explaining how H.264 and other modern video compression codecs work. There are much better guides which already do this like <a href="https://sidbala.com/h-264-is-magic/">this introductory one</a> and <a href="https://github.com/leandromoreira/digital_video_introduction#readme">this more in-depth one</a>. These are both absolutely gorgeous and I’d recommend you read them both for educational learning and expanding your spiritual intellect. But to explain the codec, it applies to each frame the same <a href="http://blog.biamp.com/how-chroma-subsampling-works/">chroma subsampling</a> JPEG uses (dividing the image into a luminosity plus two color channels then heavily compressing the latter) then divides the compressed frame into subsections called blocks. Every so often the codec sends an I-frame or Intra frame. Opposite of the Latin root <em>inter-</em> (between/among), an <em>intra-</em> (inside/within) frame describes a single frame by itself, described only by itself, not performing any motion or temporal compression. If your video consisted of only I-frames, it would be practically the same as an MJPEG stream. P-frames (predicted from the previous frame) and B-frames (bi-predicted from the previous and following frames), describe frames by the motion of the blocks in adjacent frames. Remember how we divided each compressed frame into blocks? P-frames and B-frames encode vectors describing where each of these blocks approximately moved to. If a pink flamingo suddenly appeared in only one frame, however, it would be impossible to find that flamingo in an adjacent frame and say “Hey the flamingo moved from the top left to the center.” Unless your camera is mounted high it shouldn’t be hallucinating. Thus, to encode for spontaneous pink flamingos among other new objects in the frame, the codec also sends the difference between the actual frame and the frame predicted by block motions. 
Since such difference should be small, it can be heavily compressed (with <a href="https://www.youtube.com/watch?v=Q2aEzeMDHMA">DCT</a> for example) and sent very efficiently. H.264 applies some extra compression to these frames, and that’s all!</p>
<h3 id="try-this">Try this!</h3>
<p>Before I get to the content, I’m going to quickly plug Team 3494’s <a href="https://www.chiefdelphi.com/t/potential-engine-a-simple-rtsp-server-for-frc-or-anywhere-else/348935">potential-engine</a>. It’s basically a plug and play implementation of what I describe in the <a href="#org2c50ef0">GStreamer RTSP Server</a> section. That is, all the hard work is already done for you. You don’t even need to understand what GStreamer is to use it. If you’re looking to <del>stop reading</del> take a break from my writing, check out the project! And then of course keep reading.</p>
<blockquote>
<p>I lied about not needing to understand GStreamer. While potential-engine handles all your RTSP server needs, GStreamer running on the driver station with the <code class="language-plaintext highlighter-rouge">rtspsrc</code> element plays streams with <em>significantly</em> less lag than FFmpeg, so it’s helpful to learn and use GStreamer there until we get a dashboard that supports H.264. So as I said, <a href="https://www.youtube.com/watch?v=0Hkn-LSh7es">just keep reading</a>.</p>
</blockquote>
<p><a id="org5821553"></a></p>
<h2 id="technology-overview">Technology overview</h2>
<p><a href="https://www.ffmpeg.org/about.html">FFmpeg</a> and <a href="https://gstreamer.freedesktop.org/documentation/application-development/introduction/gstreamer.html">GStreamer</a> are possibly the two largest media frameworks. They’re both open source too! Both can do streaming, but I find GStreamer more extensible.</p>
<p>Often I find FFmpeg nicer for video conversion. If you ever save video off the robot, FFmpeg is great for transcoding it to other formats. However, I’ve found that in cases of video streaming, I often get too much lag and type Fs into the FFmpeg process. GStreamer, in my view, gives more transparency in how it’s handling your stream, which in turn allows for more easily tuning it to reduce lag.</p>
<p><a id="org6b32f4b"></a></p>
<h3 id="gstreamer-basics">GStreamer basics</h3>
<blockquote>
<p>This is adapted from GStreamer’s <a href="https://gstreamer.freedesktop.org/documentation/application-development/introduction/basics.html">much better tutorial</a>, which makes for good reading.</p>
</blockquote>
<p>Our end goal here is to take video from a camera and send it (or, if you’re confident in your ability, fully send it) to another computer. The process that grabs the video, decodes it, re-encodes it, wraps it up, and ships it is called a <strong>pipeline</strong>. A pipeline is like a blanket statement; it accounts for and encapsulates everything going on.</p>
<p>We have to build our pipeline somehow. Not out of wood like most FRC robots, but out of <strong>elements</strong>. An element is like that well-written function every programmer desires to write one day: it performs one specific task. That could be reading from a camera, changing color spaces, encoding to H.264, or sending the stream across the network. All these are elements.</p>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-03-27-pipeline.png" alt="A pipeline" width="500" height="216" />
<figcaption>A visualization of a 3-element pipeline</figcaption>
</figure>
<p>Elements need to be linked together. To accomplish this, each element has one or more <strong>pads</strong>. Data flows into an element’s sink pad and out of its source pad. These names may seem reversed. Why are we sending data into the sink and out of the source? Elements that provide media (such as a file reader, <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-filesrc.html"><code class="language-plaintext highlighter-rouge">filesrc</code></a>) end with <code class="language-plaintext highlighter-rouge">src</code>. These have a single source pad. Elements to which data is sent off (<a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-tcpserversink.html"><code class="language-plaintext highlighter-rouge">tcpserversink</code></a>) end with <code class="language-plaintext highlighter-rouge">sink</code> and have a single sink pad.</p>
<p>Pads also have data types. The tutorial above gives a great analogy:</p>
<blockquote>
<p>A pad is similar to a plug or jack on a physical device. Consider, for example, a home theater system consisting of an audio amplifier, a DVD player, and a (silent) video projector. Linking the DVD player to the amplifier is allowed because both devices have audio jacks, and linking the projector to the DVD player is allowed because both devices have compatible video jacks. Links between the projector and the amplifier may not be made because the projector and amplifier have different types of jacks. Pads in GStreamer serve the same purpose as the jacks in the home theater system.</p>
</blockquote>
<p>Here we’re working with software, not your grandpa’s video equipment. In GStreamer, data types are specified as a MIME type (for example <code class="language-plaintext highlighter-rouge">video/x-h264</code>) with a number of options (like <code class="language-plaintext highlighter-rouge">width</code>, <code class="language-plaintext highlighter-rouge">height</code>, and <code class="language-plaintext highlighter-rouge">framerate</code>).</p>
<p>Some elements support multiple data types to be inputted or outputted. GStreamer determines which data types should be used by a process called <strong>caps negotiation</strong>. Caps is short for capabilities. Each element imposes its own restrictions on caps, so the process can be seen as solving a large system of equations for the correct caps to use. Whew! This system of equations may sometimes have multiple solutions, and GStreamer might not pick the right one. To recover from this doomsday fiasco, GStreamer includes a special element <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-capsfilter.html"><code class="language-plaintext highlighter-rouge">capsfilter</code></a>, which allows you to enforce the type of data which flows through it. It’s kind of like casting, only without parentheses.</p>
<p><a id="orgccf2c0a"></a></p>
<h3 id="some-example-gstreamer-pipelines">Some example GStreamer pipelines</h3>
<blockquote>
<p>This is adapted from this other <a href="https://gstreamer.freedesktop.org/documentation/tutorials/basic/gstreamer-tools.html">GStreamer tutorial</a>. It’s also good reading.</p>
</blockquote>
<p>Install GStreamer using <a href="https://gstreamer.freedesktop.org/documentation/installing/index.html">this guide</a>. Use the 64-bit version if your computer supports it. Make sure to grab the development installer as you’ll need it to use GStreamer with OpenCV. If you don’t plan on doing this the runtime installer works too. Once GStreamer is installed and on your path, you’ll be able to run the command <code class="language-plaintext highlighter-rouge">gst-launch-1.0</code>. This is how you launch a GStreamer pipeline. If your terminal replies that the <code class="language-plaintext highlighter-rouge">gst-launch-1.0</code> command was not found, double check the GStreamer <code class="language-plaintext highlighter-rouge">bin</code> folder has been added to your path.</p>
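<p>A quick way to check the path is sketched below. It assumes a POSIX-style shell (on Windows you would need something like Git Bash), and <code class="language-plaintext highlighter-rouge">check_tool</code> is just a name made up for this example.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: report whether a command can be found on your PATH
check_tool() {
  if command -v "$1"; then     # prints the location of the command when it is found
    echo "$1 is on your PATH"
  else
    echo "$1 not found; check that the GStreamer bin folder is on your PATH"
  fi
}

check_tool gst-launch-1.0
</code></pre></div></div>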
<p><a id="orga866ad3"></a></p>
<h4 id="a-very-basic-example">A very basic example</h4>
<p>Here’s about the most basic of a pipeline you can get:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gst-launch-1.0 videotestsrc <span class="o">!</span> videoconvert <span class="o">!</span> autovideosink
</code></pre></div></div>
<p><a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-videotestsrc.html"><code class="language-plaintext highlighter-rouge">videotestsrc</code></a> outputs <a href="https://en.wikipedia.org/wiki/SMPTE_color_bars">SMPTE color bars</a> to its source pad. <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good-plugins/html/gst-plugins-good-plugins-autovideosink.html"><code class="language-plaintext highlighter-rouge">autovideosink</code></a> takes input from its sink pad and displays it on your screen. The data types of these two pads include information on their color space. There is a chance their caps (capabilities) do not share a common color space. In this case, <code class="language-plaintext highlighter-rouge">videoconvert</code> converts between the two color spaces. It decides which color spaces to convert between as a result of caps negotiation.</p>
<p><a id="orgd59e5db"></a></p>
<h4 id="more-complex-examples">More complex examples</h4>
<p>This one takes video from a Logitech C920 camera (which outputs H.264-encoded video natively) and streams it over UDP to port 5002:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gst-launch-1.0 v4l2src <span class="nv">device</span><span class="o">=</span>/dev/video0 <span class="nv">name</span><span class="o">=</span>pipe_src <span class="se">\</span>
<span class="o">!</span> video/x-h264,width<span class="o">=</span>1280,height<span class="o">=</span>720,framerate<span class="o">=</span>15/1,profile<span class="o">=</span>baseline <span class="se">\</span>
<span class="o">!</span> h264parse <span class="o">!</span> rtph264pay <span class="nv">pt</span><span class="o">=</span>96 config-interval<span class="o">=</span>5 <span class="se">\</span>
<span class="o">!</span> udpsink <span class="nv">name</span><span class="o">=</span>udpsink0 <span class="nv">host</span><span class="o">=</span>127.0.0.1 <span class="nv">port</span><span class="o">=</span>5002 <span class="nv">async</span><span class="o">=</span><span class="nb">false</span>
</code></pre></div></div>
<p>While this worked in testing, this specific pipeline was unreliable in competition last year. It may have been because it was UDP. Perhaps it was because I wrote my own wrapper around this pipeline to determine the IP address of the driver station (which we didn’t make static), and the wrapper had bugs.</p>
<p>The <code class="language-plaintext highlighter-rouge">video/x-h264</code> in the pipeline above is short for a <code class="language-plaintext highlighter-rouge">capsfilter</code>. As the camera could output a variety of framerates and frame sizes, we must enforce the size and framerate we want.</p>
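<p>For completeness, here is a sketch of what the receiving end of that UDP pipeline might look like. It is not the exact command we ran at competition. It assumes the <code class="language-plaintext highlighter-rouge">avdec_h264</code> decoder from the gst-libav plugin set is installed, and the function name is invented for this example. The caps string tells <code class="language-plaintext highlighter-rouge">udpsrc</code> what the incoming RTP packets contain, and its payload number matches the <code class="language-plaintext highlighter-rouge">pt=96</code> used by the sender.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: receive and display an RTP/H.264 stream arriving on a UDP port
receive_h264_udp() {
  gst-launch-1.0 udpsrc port="$1" \
    caps="application/x-rtp,media=video,clock-rate=90000,encoding-name=H264,payload=96" \
    ! rtph264depay ! avdec_h264 ! videoconvert ! autovideosink sync=false
}

# Usage, matching the sender's port:
# receive_h264_udp 5002
</code></pre></div></div>
<p>Setting <code class="language-plaintext highlighter-rouge">sync=false</code> asks the sink to show frames as soon as they are decoded instead of waiting on timestamps, which trades smoothness for lower lag.</p>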
<p>The one below takes input from a normal webcam (we used some special fisheye cameras, but any one that works with your laptop and doesn’t require fancy drivers should do), converts it to H.264, and hands it over to an app:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gst-launch-1.0 v4l2src <span class="nv">device</span><span class="o">=</span>/dev/video0 <span class="se">\</span>
<span class="o">!</span> video/x-raw,width<span class="o">=</span>320,height<span class="o">=</span>240,framerate<span class="o">=</span>15/1 <span class="se">\</span>
<span class="o">!</span> videoconvert <span class="o">!</span> video/x-raw,format<span class="o">=</span>I420 <span class="se">\</span>
<span class="o">!</span> x264enc <span class="nv">tune</span><span class="o">=</span>zerolatency <span class="nv">bitrate</span><span class="o">=</span>250 <span class="nv">threads</span><span class="o">=</span>1 <span class="se">\</span>
<span class="o">!</span> rtph264pay config-interval<span class="o">=</span>1 <span class="nv">name</span><span class="o">=</span>pay0 <span class="nv">pt</span><span class="o">=</span>96 <span class="o">!</span> appsink
</code></pre></div></div>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-03-27-pipelinepath.png" alt="A pipeline" width="720" height="128" />
<figcaption>The pipeline above, graphically</figcaption>
</figure>
<p>The <code class="language-plaintext highlighter-rouge">rtph264pay</code> element puts the stream into the RTP container format. H.264 video is basically just a stream of bits. RTP defines how these bits get assembled into packets which can then be sent one at a time over the network.
As for the H.264 encoder, <code class="language-plaintext highlighter-rouge">x264enc</code>, the bitrate of 250 is 250 Kbps, and we use one thread because we had four cameras streaming video. If you have a machine with four cores and are using four separate processes for sending video, why bother multithreading even more? The <code class="language-plaintext highlighter-rouge">tune=zerolatency</code> here is a preset that adjusts the parameters of the encoder to try to minimize the amount of time it takes to compress the video. We’re aiming for speed rather than quality here.</p>
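<p>To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and the figures ignore RTP and UDP packet overhead, so treat them as estimates rather than measurements:</p>

```python
def per_frame_budget_bytes(bitrate_kbps, fps):
    """Average number of bytes the encoder can spend on each frame."""
    bits_per_second = bitrate_kbps * 1000  # x264enc's bitrate is given in kbit/s
    return bits_per_second / fps / 8


def total_bandwidth_kbps(bitrate_kbps, cameras):
    """Combined video bitrate for several identical streams (no packet overhead)."""
    return bitrate_kbps * cameras


# The settings used in the pipeline above: 250 kbit/s at 15 fps, four cameras.
print(per_frame_budget_bytes(250, 15))
print(total_bandwidth_kbps(250, 4))
```

<p>With the settings above, that works out to roughly 2 kB per frame on average and about 1 Mbps for all four cameras combined.</p>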
<p><a id="org2c50ef0"></a></p>
<h2 id="gstreamer-rtsp-server">GStreamer RTSP Server</h2>
<p>I made the lucky choice of using <a href="http://www.informit.com/articles/article.aspx?p=169578&seqNum=3">RTSP</a> this year, for it offers numerous advantages that using a simple <code class="language-plaintext highlighter-rouge">udpsink</code> or something doesn’t:</p>
<ul>
<li>It’s a server. That means you don’t need a static IP address for whatever is connecting to it.</li>
<li>It’s a server. That means you can connect to it multiple times and don’t need to worry about something else hogging the stream.</li>
<li>It’s a server. That means YOU DON’T HAVE TO WRITE YOUR OWN</li>
<li>It’s a server. That means you can negotiate settings with it (read: not having to hardcode them into the server), such as whether you’re using TCP or UDP</li>
<li>The client rtspsrc has a ton of options. Like what port ranges to receive on, etc.</li>
<li>You can have as many streams as you like controlled on a single port. So you won’t chew up the limited number allotted by FIRST.</li>
</ul>
<p>You may notice I said it’s a server. I think that’s really cool.</p>
<p><a id="org426c3ff"></a></p>
<h3 id="installation">Installation</h3>
<p>This thing doesn’t come with GStreamer. This means you’ll have to download and build it yourself. Here is a method that worked for me:</p>
<ol>
<li>Determine the gstreamer version by running <code class="language-plaintext highlighter-rouge">gst-launch-1.0 --version</code>. For me that’s <code class="language-plaintext highlighter-rouge">1.12.3</code>, so I’ll use that in future instructions.</li>
<li>Download the tarball of the correct release of the RTSP server. For me this is at <a href="http://gstreamer.freedesktop.org/src/gst-rtsp-server/gst-rtsp-server-1.12.3.tar.xz">http://gstreamer.freedesktop.org/src/gst-rtsp-server/gst-rtsp-server-1.12.3.tar.xz</a>. Note the version number in here. You can use <code class="language-plaintext highlighter-rouge">wget</code> for downloading.</li>
<li>Unzip that tarball by executing <code class="language-plaintext highlighter-rouge">tar -xf gst-rtsp-server-1.12.3.tar.xz</code></li>
<li>At this point you may delete the tarball</li>
<li>Navigate to the source directory (<code class="language-plaintext highlighter-rouge">cd gst-rtsp-server-1.12.3</code>)</li>
<li>Run <code class="language-plaintext highlighter-rouge">./configure</code></li>
<li>Compile! Run <code class="language-plaintext highlighter-rouge">make -j4</code> (the <code class="language-plaintext highlighter-rouge">-j4</code> splits this into 4 processes to run on 4 cores; use a different number if you have a more powerful computer)</li>
</ol>
<p>There should now be lots of little binaries in the <code class="language-plaintext highlighter-rouge">examples</code> folder!</p>
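<p>If your version differs from mine, the first two steps can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, the sample version output is an assumption about what the command prints, and the URL pattern is simply copied from the link in step 2:</p>

```python
import re

# Base URL taken from the download link in the steps above.
BASE_URL = "http://gstreamer.freedesktop.org/src/gst-rtsp-server"


def parse_gst_version(version_output):
    """Pull the first x.y.z version number out of `gst-launch-1.0 --version` output."""
    match = re.search(r"(\d+\.\d+\.\d+)", version_output)
    if match is None:
        raise ValueError("no version number found")
    return match.group(1)


def rtsp_server_tarball_url(version):
    """Build the tarball URL for the matching gst-rtsp-server release."""
    return f"{BASE_URL}/gst-rtsp-server-{version}.tar.xz"


# Assumed sample output; the real text comes from running the command yourself.
sample = "gst-launch-1.0 version 1.12.3\nGStreamer 1.12.3"
print(rtsp_server_tarball_url(parse_gst_version(sample)))
```

<p>The point is only that the RTSP server release should match the GStreamer version already installed, so the version number appears in both the URL and the directory name.</p>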
<p><a id="org2d52f60"></a></p>
<h3 id="my-own-server-pipeline">My own server pipeline</h3>
<p>To get started, you can use the <code class="language-plaintext highlighter-rouge">test-launch</code> binary to play with various pipelines. It takes the port you want to run the RTSP server on (for example, 5800 to be compatible with FRC rules) as an argument, as well as the GStreamer pipeline you’d like to use. For example:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./test-launch <span class="nt">-p</span> 5800 <span class="s2">"( videotestsrc ! x264enc ! rtph264pay name=pay0 pt=96 )"</span>
</code></pre></div></div>
<p>There are several ways to view this stream. One is to use VLC. Open a network stream at the url <code class="language-plaintext highlighter-rouge">rtsp://localhost:5800/test</code> and you should see some video. While this method works and you may already have VLC installed, it adds significant lag for some reason. 96/100 would not recommend for anything except testing that the stream opens correctly.</p>
<p>The other option is to use GStreamer, which works much better. Use the following command to view your stream:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gst-launch-1.0 rtspsrc <span class="nv">location</span><span class="o">=</span>rtsp://localhost:5800/test <span class="nv">latency</span><span class="o">=</span>0 <span class="se">\</span>
<span class="o">!</span> rtph264depay <span class="o">!</span> decodebin <span class="o">!</span> autovideosink
</code></pre></div></div>
<p>Unlike VLC, the stream should launch almost instantly.</p>
<p>If you’d like to be more advanced, you can use one of the pipelines several sections above to take input from your camera, which may or may not be more helpful for FRC than pixels colored in some predetermined pattern. This is one example which takes input from any camera you may have nearby and runs encoding on your computer:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./test-launch <span class="nt">-p</span> 5800 <span class="s2">"( v4l2src device=/dev/video0 ! video/x-raw,width=320,height=240,framerate=15/1 ! videoconvert ! video/x-raw,format=I420 ! x264enc tune=zerolatency bitrate=250 threads=1 ! rtph264pay config-interval=1 name=pay0 pt=96 )"</span>
</code></pre></div></div>
<p>When viewing the stream, you should see a nice-looking grey rectangle. After a few seconds, the H.264 encoder will send you an I frame and the video thereafter should be representative of what your camera sees.</p>
<p><a id="org12e5995"></a></p>
<h3 id="some-fries-on-the-side">Some fries on the side</h3>
<p>For some, one camera feed is not enough. You might want a camera on the side of your robot, perhaps underneath, or maybe facing up so you don’t have to strain your neck to look at the ceiling when you’re bored. While this task may seem daunting, don’t worry! There’s more to the GStreamer RTSP server project than just this one command line example. It’s an <em>example</em> after all. To add multiple feeds I’ll work off the <code class="language-plaintext highlighter-rouge">test-launch.c</code> example. Here are the changes you’ll need to make:</p>
<p><a id="org691f2dc"></a></p>
<h4 id="1-hardcode-the-pipeline">1. Hardcode the pipeline</h4>
<p>If you’re feeling like a good programmer, you could pass all the pipelines through the command line and support any number of them. I do not identify myself as such a mythical programmer, so I will instead embed the pipelines in the source code.</p>
<p>Replace the line <code class="language-plaintext highlighter-rouge">gst_rtsp_media_factory_set_launch (factory, argv[1]);</code> with something like:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">gst_rtsp_media_factory_set_launch</span> <span class="p">(</span><span class="n">factory</span><span class="p">,</span> <span class="s">"( v4l2src device=/dev/video0 ! video/x-raw,width=320,height=240,framerate=15/1 ! videoconvert ! video/x-raw,format=I420 ! x264enc tune=zerolatency bitrate=250 threads=1 ! rtph264pay config-interval=1 name=pay0 pt=96 )"</span><span class="p">);</span>
</code></pre></div></div>
<p><a id="org2d1bd1a"></a></p>
<h4 id="2-add-the-second-pipeline">2. Add the second pipeline!</h4>
<p>Before the line you just changed, there’s a line which creates a <code class="language-plaintext highlighter-rouge">factory</code> object, which is used to stream our pipeline. The two lines of code after the one you changed also help accomplish this mission. To add a second pipeline, you’ll need to copy these four lines and modify them a little. Change the pipeline string to something different this time and change the url the pipeline is served on.</p>
<p>In total, you should be adding four lines similar to the following directly before the line <code class="language-plaintext highlighter-rouge">g_object_unref (mounts);</code>:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">factory</span> <span class="o">=</span> <span class="n">gst_rtsp_media_factory_new</span> <span class="p">();</span>
<span class="n">gst_rtsp_media_factory_set_launch</span> <span class="p">(</span><span class="n">factory</span><span class="p">,</span> <span class="s">"( videotestsrc ! x264enc ! rtph264pay name=pay0 pt=96 )"</span><span class="p">);</span>
<span class="n">gst_rtsp_media_factory_set_shared</span> <span class="p">(</span><span class="n">factory</span><span class="p">,</span> <span class="n">TRUE</span><span class="p">);</span>
<span class="n">gst_rtsp_mount_points_add_factory</span> <span class="p">(</span><span class="n">mounts</span><span class="p">,</span> <span class="s">"/fries"</span><span class="p">,</span> <span class="n">factory</span><span class="p">);</span>
</code></pre></div></div>
<p>Repeat this step as many times as you’d like for more streams!</p>
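<p>If copying and pasting gets tedious, a small script can print the four lines for each stream so you can paste them into <code class="language-plaintext highlighter-rouge">test-launch.c</code>. This is a hypothetical convenience sketch, not part of the RTSP server project; the helper names and the example streams are my own:</p>

```python
def c_string(text):
    """Escape backslashes and double quotes so text fits in a C string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def factory_lines(mount, pipeline):
    """Return the four C lines that serve `pipeline` at the URL path `mount`."""
    return [
        "factory = gst_rtsp_media_factory_new ();",
        f"gst_rtsp_media_factory_set_launch (factory, {c_string('( ' + pipeline + ' )')});",
        "gst_rtsp_media_factory_set_shared (factory, TRUE);",
        f"gst_rtsp_mount_points_add_factory (mounts, {c_string(mount)}, factory);",
    ]


# Example streams; swap in your own mount points and pipelines.
streams = {
    "/fries": "videotestsrc ! x264enc ! rtph264pay name=pay0 pt=96",
    "/ketchup": "videotestsrc pattern=ball ! x264enc ! rtph264pay name=pay0 pt=96",
}

for mount, pipeline in streams.items():
    print("\n".join(factory_lines(mount, pipeline)))
```

<p>Each mount point needs its own path, and every pipeline still has to end in a payloader named <code class="language-plaintext highlighter-rouge">pay0</code>, as in the examples above.</p>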
<p><a id="orgc779709"></a></p>
<h4 id="3-compile">3. Compile!</h4>
<p>Save <code class="language-plaintext highlighter-rouge">test-launch.c</code>, open a terminal in the same directory, and run <code class="language-plaintext highlighter-rouge">make</code>. If the code compiles without errors or glaring red text, running <code class="language-plaintext highlighter-rouge">./test-launch -p 5800</code> should start an RTSP server with two streams accessible by <code class="language-plaintext highlighter-rouge">rtsp://localhost:5800/test</code> and <code class="language-plaintext highlighter-rouge">rtsp://localhost:5800/fries</code>.</p>
<p><a id="orgc8f63da"></a></p>
<h3 id="extra-fanciness">Extra fanciness</h3>
<p>There’s much more to the RTSP server. Examples of these other “much more”s are in the <code class="language-plaintext highlighter-rouge">examples</code> folder. Perhaps you are concerned about job security and would like to password protect your stream so you are the only person on the team capable of opening it. The file <code class="language-plaintext highlighter-rouge">test-auth.c</code> has an example of how to implement authentication.</p>
<h3 id="you-said-you-wanted-to-do-vision-processing-too">You said you wanted to do vision processing too?</h3>
<p>I have said nothing of OpenCV up to this point except name-dropping it in the table of contents. If you’d like to use it (or your generated GRIP code) to do target detection with the video feed you are streaming, you have three possible options:</p>
<ol>
<li>Fetch the stream from the RTSP server then decode it</li>
<li>Write the stream from the RTSP server to <a href="https://en.wikipedia.org/wiki/Shared_memory">shared memory</a> then access the memory via OpenCV</li>
<li>Open the camera from OpenCV while writing to shared memory then read the memory within the RTSP server</li>
</ol>
<p>Option 1 is simply inefficient. By encoding then decoding on the same computer, you’re only using extra CPU cycles, consuming more energy, which depending on where you obtain your energy from is either raising CO2 levels, milling more wind, damming more water, or stealing more light from the sun. Option 2 may work, but likely won’t. The pipelines created within the RTSP server are only run when a client is connected. If no clients are connected, the pipeline is not run. This means your vision won’t work unless you are connected to the RTSP server. Maybe this is how you control your AI algorithms, but option 3 is likely better. Assuming your OpenCV code doesn’t frequently segfault or throw NullPointerExceptions, it will constantly write new frames to shared memory, and the RTSP server will only read these when a client is connected.</p>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-04-09-pipelinecomplicated.png" alt="An illustration of the pipeline" width="596" height="332" />
<figcaption>An illustration of option 3, which you better be using</figcaption>
</figure>
<p>In order to do this magic you’re going to need OpenCV to be built with GStreamer support on whatever device you’re using. Perhaps your package repositories come with said support. <a href="https://www.learnopencv.com/get-opencv-build-information-getbuildinformation/">Check your build options</a> to see if such support exists. If not, install packages named something like <code class="language-plaintext highlighter-rouge">libgstreamer1.0-dev</code> and <code class="language-plaintext highlighter-rouge">libgstreamer-plugins-base1.0-dev</code> and compile OpenCV (with Python support if you want it) following instructions from one of the many great guides on the interwebs.</p>
<p>Afterwards, believe my explanation on why option 3 is better and then implement it.</p>
<p>To write to shared memory via OpenCV, use the <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-bad-plugins/html/gst-plugins-bad-plugins-shmsink.html"><code class="language-plaintext highlighter-rouge">shmsink</code></a> element. Rather than doing something like <code class="language-plaintext highlighter-rouge">VideoCapture(2)</code>, use:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">VideoCapture</span><span class="p">(</span><span class="s">'v4l2src device=/dev/video2 ! '</span>
<span class="s">'video/x-raw,width=320,height=240,framerate=15/1 ! '</span>
<span class="s">'tee name=t ! queue ! videoconvert ! appsink drop=true '</span>
<span class="s">'t. ! queue ! shmsink socket-path=/tmp/wassup'</span><span class="p">)</span>
</code></pre></div></div>
<p>Then in your RTSP server use:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">gst_rtsp_media_factory_set_launch</span> <span class="p">(</span><span class="n">factory</span><span class="p">,</span> <span class="s">"( shmsrc socket-path=/tmp/wassup ! <INSERT CAPS HERE> ! videoconvert ! video/x-raw,format=I420 ! x264enc tune=zerolatency bitrate=250 threads=1 ! rtph264pay config-interval=1 name=pay0 pt=96 )"</span><span class="p">);</span>
</code></pre></div></div>
<p>Because GStreamer is only writing the frame contents to memory, it won’t know what the format of these frames actually is when it reads them. For all it knows they could be base-64 encoded then phonetically translated to the Russian alphabet and translated back to Pig Latin. Thus, you’ll need to determine the data type (caps) GStreamer writes to the <code class="language-plaintext highlighter-rouge">shmsink</code> and specify this data type when you read the frames. Run the first pipeline using <code class="language-plaintext highlighter-rouge">gst-launch-1.0 -v v4l2src ...</code>, and you’ll see the caps printed in the console. Replace <code class="language-plaintext highlighter-rouge"><INSERT CAPS HERE></code> with the line that corresponds to the <code class="language-plaintext highlighter-rouge">shmsink</code> element, transcribed to fit the format you’ve been seeing here (which only entails removing the <code class="language-plaintext highlighter-rouge">(int)</code> and <code class="language-plaintext highlighter-rouge">(string)</code> strings).</p>
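<p>The transcription step can be sketched in Python. Treat this as a rough helper under a few assumptions: the function name is hypothetical, the sample caps line is made up rather than copied from a real run, and I’m assuming some GStreamer versions print the caps quoted with backslash escapes, which the helper also strips:</p>

```python
import re


def clean_caps(verbose_caps):
    """Turn caps as printed by `gst-launch-1.0 -v` into pipeline-string form."""
    caps = verbose_caps.strip().strip('"')
    caps = caps.replace("\\", "")        # remove backslash escapes, if any
    caps = re.sub(r"\(\w+\)", "", caps)  # drop type annotations such as (int)
    return re.sub(r",\s*", ",", caps)    # drop the space after each comma


# Illustrative caps line; yours will depend on the camera.
sample = (
    "video/x-raw, format=(string)YUY2, width=(int)320, "
    "height=(int)240, framerate=(fraction)15/1"
)
print(clean_caps(sample))
```

<p>Besides <code class="language-plaintext highlighter-rouge">(int)</code> and <code class="language-plaintext highlighter-rouge">(string)</code>, this also removes other annotations in the same style, such as <code class="language-plaintext highlighter-rouge">(fraction)</code> on the framerate.</p>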
<p>You can probably tell I haven’t implemented any of the shared memory stuff recently. Take these instructions with a grain of salt and if they don’t work or are too vague, please consider creating an issue on <a href="https://github.com/rianadon/blog/issues">my blog repository</a>.</p>
<h4 id="drawing-contours-and-debugging-info">Drawing contours and debugging info</h4>
<p>If you’re running vision processing on the images you’re streaming, you may also be interested in adding overlays to your stream, such as the contours and bounding boxes your algorithm detects. Rather than three methods as before, there are only two ways you can accomplish this:</p>
<ol>
<li>Open your stream on the driver station using OpenCV (I cover the installation steps required for this later in the post). You can then use OpenCV’s drawing functions and then use <code class="language-plaintext highlighter-rouge">imshow</code> to display your frames.</li>
<li>Use a language that has GStreamer bindings. I could never get the Python ones to work on Windows (as of a year and a half ago), but Java and C both work. Use these library bindings to create a pipeline with an <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-bad-plugins/html/gst-plugins-bad-plugins-rsvgoverlay.html"><code class="language-plaintext highlighter-rouge">rsvgoverlay</code></a>, then edit the element’s <code class="language-plaintext highlighter-rouge">data</code> property to reflect an SVG representation of your contours.</li>
</ol>
<p>Option 1 is much easier, but may be slightly more laggy than option 2. If you’d like to embark on option 2, you could attempt to understand some <a href="https://github.com/HarkerRobo/SillyDashboard2017/blob/master/src/CameraStream.java">undocumented Java code</a> I once wrote to edit an <code class="language-plaintext highlighter-rouge">rsvgoverlay</code> (check out the <code class="language-plaintext highlighter-rouge">createStream</code> method and the block on line 78). This is really the sort of thing I would write another blog post on, but writing about Java isn’t my cup of tea.</p>
<p><a id="org98ce02a"></a></p>
<h3 id="some-more-on-viewing">Some more on viewing</h3>
<p>As mentioned before, the <code class="language-plaintext highlighter-rouge">rtspsrc</code> element comes with a plethora of features. Running <code class="language-plaintext highlighter-rouge">gst-inspect-1.0 rtspsrc</code> will list them for you. You can use <code class="language-plaintext highlighter-rouge">protocols=tcp</code> to force the server to send the stream over TCP (which we have not had to do but may be needed if UDP proves unreliable for you) or <code class="language-plaintext highlighter-rouge">port-range=5800-5810</code> to force the stream to be sent back via a limited subset of ports.</p>
<blockquote>
<p>Historically the FMS has seemed to filter only the ports servers are hosted on, which means clients can connect on any port they like. So while the RTSP server has to be on a legal port like 5800, we’ve been able to connect on any random port number. Thus, we haven’t needed to use the <code class="language-plaintext highlighter-rouge">port-range</code> option. But it’s there if you ever need it.</p>
</blockquote>
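<p>If you end up launching several viewers with different options, a tiny Python helper can assemble the pipeline string for you. This is a minimal sketch: the function name and its parameters are my own invention, and it only covers the <code class="language-plaintext highlighter-rouge">rtspsrc</code> options mentioned in this post:</p>

```python
def viewer_pipeline(host, port, mount, tcp=False, port_range=None, latency=0):
    """Build a gst-launch-1.0 argument string for viewing one RTSP stream."""
    src = [f"rtspsrc location=rtsp://{host}:{port}{mount}", f"latency={latency}"]
    if tcp:
        src.append("protocols=tcp")  # force the stream over TCP
    if port_range is not None:
        low, high = port_range
        src.append(f"port-range={low}-{high}")  # limit the client's receive ports
    return " ".join(src) + " ! rtph264depay ! decodebin ! autovideosink"


# Same viewer as earlier in the post, then one with both options set.
print(viewer_pipeline("localhost", 5800, "/test"))
print(viewer_pipeline("localhost", 5800, "/fries", tcp=True, port_range=(5800, 5810)))
```

<p>Paste the printed string after <code class="language-plaintext highlighter-rouge">gst-launch-1.0</code> in a terminal to open the stream.</p>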
<p><a id="org261ba64"></a></p>
<h2 id="systemd-and-services">Systemd and services</h2>
<p>If you’re running GStreamer on Linux, chances are your system comes with a tool for starting services on boot and logging them called <a href="https://www.linode.com/docs/quick-answers/linux-essentials/what-is-systemd/">systemd</a>. While it is <a href="https://www.zdnet.com/article/linus-torvalds-and-others-on-linuxs-systemd/">controversial</a> due primarily to its <a href="http://without-systemd.org/wiki/index.php/Arguments_against_systemd">feature creep among other issues</a>, it’s nevertheless utilized by many of the processes your computer runs on boot and in the background. To automatically run our RTSP server on boot, I’ll <a href="https://www.devdungeon.com/content/creating-systemd-service-files">create a systemd service</a>.</p>
<p>Here is the service file I’ll use:</p>
<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># /etc/systemd/system/streaming.service
</span>
<span class="nn">[Unit]</span>
<span class="py">Description</span><span class="p">=</span><span class="s">Team 1072 Camera Streaming</span>
<span class="nn">[Service]</span>
<span class="py">ExecStart</span><span class="p">=</span><span class="s">/home/team1072/startstreaming.sh</span>
<span class="py">KillMode</span><span class="p">=</span><span class="s">process</span>
<span class="py">Restart</span><span class="p">=</span><span class="s">on-failure</span>
<span class="py">RestartPreventExitStatus</span><span class="p">=</span><span class="s">255</span>
<span class="py">Type</span><span class="p">=</span><span class="s">simple</span>
<span class="py">WorkingDirectory</span><span class="p">=</span><span class="s">/home/team1072/video</span>
<span class="nn">[Install]</span>
<span class="py">WantedBy</span><span class="p">=</span><span class="s">multi-user.target</span>
</code></pre></div></div>
<p>You’ll notice here I refer to this <code class="language-plaintext highlighter-rouge">startstreaming.sh</code> script. The contents of it are below:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="nb">set</span> <span class="nt">-e</span> <span class="c"># Exit if any of the commands below error</span>
<span class="c"># Reload the uvcvideo driver with a specific quirks option</span>
<span class="c"># quirks=128 enables a fix for bandwidth issues with the type of cameras we use.</span>
<span class="c"># Basically, it allows us to use multiple cameras on the same usb port.</span>
rmmod uvcvideo
modprobe uvcvideo <span class="nv">quirks</span><span class="o">=</span>128
<span class="c"># Run the test-launch script with gstreamer debugging turned on</span>
<span class="nv">GST_DEBUG</span><span class="o">=</span>2 /home/team1072/gst-rtsp-server-1.12.3/examples/test-launch <span class="nt">-p</span> 5800
</code></pre></div></div>
<p>Now instruct systemd to load the service files again:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>systemctl daemon-reload
</code></pre></div></div>
<p>To start our service and enable it on boot, run:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>systemctl start streaming
<span class="nb">sudo </span>systemctl <span class="nb">enable </span>streaming
</code></pre></div></div>
<p>If you run <code class="language-plaintext highlighter-rouge">sudo systemctl status streaming</code>, you should see that the service is running and some log output from the server.</p>
<p>If you’d like to view the complete logs of the service, you can run:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>journalctl <span class="nt">-u</span> streaming
</code></pre></div></div>
<p><a id="org8199371"></a></p>
<h3 id="bonus-saving-limelight-video">Bonus: Saving limelight video</h3>
<p>Systemd is useful for just about any service you want to run. Since we use a <a href="https://limelightvision.io/">Limelight</a> on our robot, we find it useful to save its video for future debugging purposes. Below is a script to do just that:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
# download-limelight.sh
# Form a filename based on the current date and time
file=videos/$(date "+%Y-%m-%d;%H-%M-%S-limelight.mjpeg")
# Download the stream to that file
gst-launch-1.0 souphttpsrc location=http://10.10.72.11:5802 ! filesink location=$file append=true
</code></pre></div></div>
<p>This is the service file which runs that script:</p>
<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># /etc/systemd/system/download-limelight.service
</span>
<span class="nn">[Unit]</span>
<span class="py">Description</span><span class="p">=</span><span class="s">Limelight video downloader</span>
<span class="nn">[Service]</span>
<span class="py">Type</span><span class="p">=</span><span class="s">simple</span>
<span class="py">ExecStart</span><span class="p">=</span><span class="s">/home/team1072/download-limelight.sh</span>
<span class="py">Restart</span><span class="p">=</span><span class="s">always</span>
<span class="py">RestartSec</span><span class="p">=</span><span class="s">30</span>
<span class="nn">[Install]</span>
<span class="py">WantedBy</span><span class="p">=</span><span class="s">multi-user.target</span>
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">RestartSec</code> parameter instructs systemd to wait 30 seconds before restarting the script if it fails. This way, if the Limelight isn’t on, the script won’t hog any CPU power.</p>
<p><a id="orgfade6f9"></a></p>
<h2 id="opencv-and-gstreamer">OpenCV and GStreamer</h2>
<p>In the examples above GStreamer was used to view the RTSP stream. This is enough for any casual use, and you can write a batch script (or whatever shell scripting language your system supports) to launch multiple streams. However, there’s a limit: GStreamer supports a limited number of options for modifying your stream (such as adding overlays or applying perspective transforms). That’s not to say its support is basic: GStreamer can rotate your stream, draw SVGs on top, scale it, and add filters so you can post it on your social media. However, opening the stream via OpenCV drops you into a magical world.</p>
<p>Unfortunately, OpenCV doesn’t just come with GStreamer support. The package on pip doesn’t support it. The anaconda packages don’t support it by default. Perhaps Homebrew supports it with the right options, but the FRC driver station only runs on Windows so Mac software is out of the question. To gain GStreamer support, you’ll need to build OpenCV yourself. The process takes some time, so be prepared to set aside a few hours, but after building OpenCV you’ll be one FBI visit away from feeling like a hacker. Let’s get started!</p>
<p><a id="org9ee87bb"></a></p>
<h3 id="1-download-gstreamer">1. Download GStreamer</h3>
<p>If you didn’t download the development version of GStreamer, install it using <a href="https://gstreamer.freedesktop.org/documentation/installing/index.html">this guide</a>.</p>
<p><a id="org4897296"></a></p>
<h3 id="2-download-opencv">2. Download OpenCV</h3>
<p>Grab a release from the <a href="https://github.com/opencv/opencv/releases">OpenCV GitHub releases page</a>. Download the archive labeled <code class="language-plaintext highlighter-rouge">Source code</code> as a zip or tar. Unzip/tar it to some location you’ll remember.</p>
<p><a id="org01bcd20"></a></p>
<h3 id="3-download-the-build-tools">3. Download the build tools</h3>
<p>To build OpenCV you can choose between using Visual Studio and MinGW as your compiler. I’ll use Visual Studio since it’s well supported by Microsoft which made your operating system. Download the community edition from <a href="https://visualstudio.microsoft.com/downloads/">here</a> (or by searching for Visual Studio in your preferred search engine). OpenCV also uses CMake as a build tool. Binary distributions are available on the <a href="https://cmake.org/download/">CMake download site</a>.</p>
<p><a id="org86de0f0"></a></p>
<h3 id="4-install-everything-else">4. Install everything else</h3>
<p>If you’d like to use OpenCV from Python, make sure Python and NumPy (via <code class="language-plaintext highlighter-rouge">pip install numpy</code>) are installed. There are several other optimization libraries you may wish to install listed on OpenCV’s <a href="https://docs.opencv.org/3.4.3/d3/d52/tutorial_windows_install.html">tutorial</a>.</p>
<p><a id="orgee96ae1"></a></p>
<h3 id="5-create-a-build-folder">5. Create a build folder</h3>
<p>Open the OpenCV folder you downloaded from GitHub. It should have files named <code class="language-plaintext highlighter-rouge">README.md</code> and <code class="language-plaintext highlighter-rouge">CMakeLists.txt</code> in it. Create a new folder inside and name it <code class="language-plaintext highlighter-rouge">build</code>.</p>
<p><a id="org9d226ab"></a></p>
<h3 id="6-run-cmake">6. Run CMake</h3>
<p>Open the CMake graphical interface. Next to “Where is the source code:”, select the OpenCV folder. As for “Where to build the binaries:”, select the build folder you created.</p>
<p>Then click “Add Entry”. Add a path option with name <code class="language-plaintext highlighter-rouge">GSTREAMER_DIR</code> and path <code class="language-plaintext highlighter-rouge">C:/gstreamer/1.0/x86_64</code> (assuming you installed 64-bit GStreamer to <code class="language-plaintext highlighter-rouge">C:/gstreamer</code>).</p>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-03-30-cmake1.png" alt="Selecting the build and source folder location" width="469" height="373.5" />
<figcaption>The dialog you get by clicking the button with a big blue cross</figcaption>
</figure>
<p>Click the Configure button. You’ll be asked to select a compiler. Select the version of Visual Studio you installed. You’ll also be asked to select an architecture (“Optional platform for generator”). Chances are you installed the 64-bit version of GStreamer, so use <code class="language-plaintext highlighter-rouge">x64</code>. If Visual Studio complains about errors linking libraries, chances are you selected the wrong architecture.</p>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-03-30-cmake2.png" alt="Selecting the compiler to use" width="501" height="360.5" />
<figcaption>A fine choice of compiler options</figcaption>
</figure>
<p>If the output in the log shows that OpenCV found GStreamer, click “Generate” then “Open Project”.</p>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-03-30-cmake4.png" alt="CMake not finding GStreamer" width="441" height="197.5" />
<figcaption>GStreamer was NO-t found! Abort! This entry should say YES.</figcaption>
</figure>
<p><a id="orgec03646"></a></p>
<h3 id="7-build-with-visual-studio">7. Build with Visual Studio</h3>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-03-30-cmake3.png" alt="Building with Visual Studio" width="431" height="220" />
<figcaption>Can you guess what version of Visual Studio this is? It isn't 2017.</figcaption>
</figure>
<p>After clicking “Open Project”, Visual Studio will open itself.</p>
<p>Before building, make sure to switch to Release mode, unless you installed debug binaries with Python and would like to compile in Debug mode.</p>
<figure>
<img src="https://rianadon.github.io/blog/assets/2019-03-30-cmake5.png" alt="Selecting Release Mode" width="280.5" height="42.5" />
<figcaption>Switch the leftmost dropdown to Release.</figcaption>
</figure>
<p>Now to build! Under the solution panel, right click the <code class="language-plaintext highlighter-rouge">BUILD_ALL</code> task then click “Build”. This process (the building, not the clicking) takes about an hour, depending on how powerful your computer is. Once it finishes successfully, build the <code class="language-plaintext highlighter-rouge">INSTALL</code> task. Depending on how you installed Python, you may need to quit Visual Studio and relaunch it as Administrator to resolve permissions errors.</p>
<p><a id="org4fd8ab3"></a></p>
<h3 id="8-prosper">8. Prosper</h3>
<p>After launching Python, you should be able to <code class="language-plaintext highlighter-rouge">import cv2</code>. You can open the RTSP stream using a GStreamer pipeline via the OpenCV <code class="language-plaintext highlighter-rouge">VideoCapture</code> object, such as in the following code:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span>
<span class="c1"># Open the stream via GStreamer. Note that the pipeline ends with an appsink
# element, which OpenCV reads from. The sync and drop options here instruct
# Gstreamer to not block the program waiting for new frames and to drop
# frames if OpenCV cannot read them quickly enough.
</span><span class="k">def</span> <span class="nf">makecap</span><span class="p">():</span>
    <span class="k">return</span> <span class="n">cv2</span><span class="p">.</span><span class="n">VideoCapture</span><span class="p">(</span>
        <span class="s">'rtspsrc location=rtsp://localhost:5800/test latency=0 ! rtph264depay ! decodebin ! appsink sync=false drop=true'</span><span class="p">,</span>
        <span class="n">cv2</span><span class="p">.</span><span class="n">CAP_GSTREAMER</span>
    <span class="p">)</span>
<span class="n">cap</span> <span class="o">=</span> <span class="n">makecap</span><span class="p">()</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
    <span class="n">successful</span><span class="p">,</span> <span class="n">frame</span> <span class="o">=</span> <span class="n">cap</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>
    <span class="k">if</span> <span class="n">successful</span><span class="p">:</span> <span class="c1"># Display the frame
</span>        <span class="n">cv2</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Frame'</span><span class="p">,</span> <span class="n">frame</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">cv2</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span> <span class="o">==</span> <span class="nb">ord</span><span class="p">(</span><span class="s">'q'</span><span class="p">):</span>
            <span class="k">break</span> <span class="c1"># Exit the program if the key q is pressed
</span>    <span class="k">else</span><span class="p">:</span> <span class="c1"># If a frame can't be read try restarting the stream
</span>        <span class="n">cap</span> <span class="o">=</span> <span class="n">makecap</span><span class="p">()</span>
</code></pre></div></div>
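<p>If <code class="language-plaintext highlighter-rouge">VideoCapture</code> never returns a frame, it can help to confirm that the <code class="language-plaintext highlighter-rouge">cv2</code> Python picked up was really built with GStreamer. The sketch below is only a rough check: the helper name <code class="language-plaintext highlighter-rouge">gstreamer_enabled</code> is made up for illustration, and it assumes the build report contains a line of the form <code class="language-plaintext highlighter-rouge">GStreamer: YES (version)</code>, which may differ between OpenCV versions.</p>

```python
def gstreamer_enabled(build_info):
    """Return True if an OpenCV build report lists GStreamer as enabled.

    Assumption: the report has a line like "GStreamer:   YES (1.14.4)".
    """
    for line in build_info.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "GStreamer":
            return value.strip().upper().startswith("YES")
    return False  # No GStreamer line found at all


# Usage, with the cv2 module you just built:
#   import cv2
#   print(gstreamer_enabled(cv2.getBuildInformation()))
```

<p>This mirrors the YES/NO entry you looked for in the CMake log, just checked from the Python side.</p>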
<p>You could write this code in C too, but I don’t C why you would want to. Just kidding; it’s a great language, but I’m just keeping this blog post concise by keeping code in one language.</p>
<p>The code handles auto-restarting the RTSP client, so you can leave the script running in the background and let it automatically reconnect when your RTSP server comes online. Even if you don’t have any processing to do on the frames, OpenCV is worth using just for automatically restarting your stream when your network fails.</p>
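<p>The restart logic can also be pulled out into a small generator. This is a sketch under assumptions, not the code we ran: <code class="language-plaintext highlighter-rouge">read_frames</code>, <code class="language-plaintext highlighter-rouge">retry_delay</code>, and <code class="language-plaintext highlighter-rouge">max_failures</code> are hypothetical names, and unlike the loop above it releases the old capture and pauses before reconnecting.</p>

```python
import time


def read_frames(open_capture, retry_delay=1.0, max_failures=None):
    """Yield frames, reopening the capture whenever a read fails.

    open_capture: zero-argument function returning an object with
        read() -> (ok, frame) and release(), e.g. makecap from above.
    max_failures: stop after this many consecutive failed reads
        (None means keep retrying forever).
    """
    failures = 0
    cap = open_capture()
    try:
        while max_failures is None or failures < max_failures:
            ok, frame = cap.read()
            if ok:
                failures = 0
                yield frame
            else:
                failures += 1
                cap.release()            # Free the dead stream first
                time.sleep(retry_delay)  # Avoid hammering the server
                cap = open_capture()
    finally:
        cap.release()
```

<p>With the function above you would write <code class="language-plaintext highlighter-rouge">for frame in read_frames(makecap): ...</code> and keep the display code inside the loop body.</p>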
<p>If you’re not running any processing, you may prefer to continue using GStreamer’s viewing window. In that case, you can split the video (that’s what <code class="language-plaintext highlighter-rouge">tee</code> does) into one stream that’s displayed and another that’s passed to OpenCV. Here is an example pipeline that does this:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s">'rtspsrc location=rtsp://10.10.72.12:5800/test latency=0 transport=tcp ! rtph264depay ! decodebin ! tee name=t ! queue ! videoconvert ! appsink sync=false drop=true t. ! queue ! videoconvert ! autovideosink'</span>
</code></pre></div></div>
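<p>Since that pipeline is one long string, here is a sketch that assembles it from its parts so the two <code class="language-plaintext highlighter-rouge">tee</code> branches are easier to see. The function name <code class="language-plaintext highlighter-rouge">tee_pipeline</code> and its parameters are invented for this example; the elements and options are the ones from the pipeline above.</p>

```python
def tee_pipeline(location, latency=0, transport="tcp"):
    """Build the tee pipeline string shown above from its pieces."""
    source = f"rtspsrc location={location} latency={latency} transport={transport}"
    decode = "rtph264depay ! decodebin"
    # Branch 1: frames handed to OpenCV through appsink
    to_opencv = "queue ! videoconvert ! appsink sync=false drop=true"
    # Branch 2: frames shown in GStreamer's own viewing window
    to_window = "queue ! videoconvert ! autovideosink"
    # "t." refers back to the tee named t and starts the second branch
    return f"{source} ! {decode} ! tee name=t ! {to_opencv} t. ! {to_window}"


pipeline = tee_pipeline("rtsp://10.10.72.12:5800/test")
```

<p>The resulting string can be passed to <code class="language-plaintext highlighter-rouge">cv2.VideoCapture</code> together with <code class="language-plaintext highlighter-rouge">cv2.CAP_GSTREAMER</code>, exactly like the single-branch pipeline earlier.</p>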
<p><a id="org67c1856"></a></p>
<h2 id="appendix-some-code">Appendix: Some Code</h2>
<p>Throughout this tutorial I’ve been pasting snippets of code. If you’d like to see our code in practice, it’s at <a href="https://gist.github.com/rianadon/1e9528a2daedf4fcd7d2e736d8ae2164">this gist</a>.</p>
<p><a id="org9fc34fc"></a></p>
<h2 id="a-little-postmortem">A little postmortem</h2>
<p>If you closely inspected my code, you may have noticed that I name the videos I save by the current time. Most mini computers running Linux don’t happen to have a real-time clock on them. Firstly, this means that if your computer isn’t connected to the internet, it will report the incorrect time. That may or may not be okay with you. However, the computer we used (a Kangaroo PC) did something more egregious: at boot the time was always the same. That’s right; our system loved March 6 at 4:24 AM, even if we ran our code several weeks later. That meant video was stored to the same exact files. Luckily we used the <code class="language-plaintext highlighter-rouge">append=true</code> option, so the video concatenated itself into one terrible chunk.</p>
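<p>One way to make filenames independent of the clock is to generate them randomly. A minimal sketch, assuming Python’s standard <code class="language-plaintext highlighter-rouge">uuid</code> module; the helper name <code class="language-plaintext highlighter-rouge">recording_path</code>, the <code class="language-plaintext highlighter-rouge">recordings</code> directory, and the <code class="language-plaintext highlighter-rouge">.mkv</code> extension are placeholders, not what the gist uses:</p>

```python
import uuid
from pathlib import Path


def recording_path(directory="recordings", extension=".mkv"):
    """Return a path with a random name that never depends on the clock."""
    # uuid4 is random, so a reboot with a reset clock cannot reuse a name
    return Path(directory) / f"{uuid.uuid4().hex}{extension}"


path = recording_path()
```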
<p>I haven’t taken the time to implement such a scheme, but using strings of random characters as filenames would have been a much better option. If you’re blindly copying code from the gist above you may want to consider this. I’ll be making such changes the morning we compete at Houston.</p>rianadonFor two and a half years, I’ve been trying to reliably stream H.264 video from our robot to our driver station. Finally, this year, the stream worked reliably. I’m going to graciously detail how we did this. The post is structured as a tutorial, with some background theory worked in. I’ve tried to provide links wherever possible for further exploration.A Month or Two with Arch Linux2018-08-17T00:00:00+00:002018-08-17T00:00:00+00:00https://rianadon.github.io/blog/2018/08/17/a-month-of-arch<p>Ever since Windows released the Windows Subsystem for Linux, I’ve found
myself relying on the Linux shell and programs for more of my daily tasks.
When I could no longer easily clear my storage, I decided it was time to
switch operating systems. I had previously been playing with Arch in a
virtual machine and had fallen in love with its package manager and the
abundance of up-to-date packages in its repository, so I decided to dual
boot Arch and Windows and see how things went. This is a write-up of my
processes, experiences, and observations, both as a reference for myself and as tips for anyone who finds them useful.</p>
<p>First, I’ll start off by posting a screenshot of what my setup
currently looks like. Some of the config was adapted from <a href="https://www.reddit.com/r/unixporn/comments/8sr6pt/i3gaps_btw_i_use_arch/?utm_term=30541523555&utm_medium=comment_embed&utm_source=embed&utm_name=null&utm_content=footer">this Reddit
post</a>. The wallpaper is by a guy named <a href="https://www.pexels.com/photo/landscape-photography-of-forest-1103971/">Johannes Plenio</a>.</p>
<figure>
<img src="https://rianadon.github.io/blog/assets/2018-08-13-desktop.png" alt="My desktop" />
<figcaption>Yep, that's unfortunately my desktop</figcaption>
</figure>
<h2 id="installing-arch">Installing Arch</h2>
<p>I decided to partition my 256 GB SSD (well not quite 256 GB) in the
following manner:</p>
<ul>
<li>500 MB for EFI</li>
<li>75 GB for Windows</li>
<li>200 MB for boot</li>
<li>31 GB encrypted LVM partition
<ul>
<li>8 GB swap partition (in case I do something very RAM-hungry)</li>
<li>25 GB root partition for Arch</li>
</ul>
</li>
<li>10 GB Windows recovery partition</li>
<li>121 GB shared data partition, encrypted with BitLocker</li>
</ul>
<p>The EFI and recovery partitions were left over from my Windows install,
so I didn’t have to touch those. After clean-installing Windows, I turned off
hibernation, paging, and anything else that would create files during the
resize, then shrank the Windows partition and created the data partition
from within Windows. Then I went through the install process after
booting from an Arch USB drive. Interestingly, there are a few more Windows
partitions that didn’t show up under Windows. Since they are adjacent,
I assume Windows combines them when viewing partitions.</p>
<p>The install process was a fusion of multiple guides, the one which I
followed most being <a href="https://gist.github.com/HardenedArray/31915e3d73a4ae45adc0efa9ba458b07">this
one</a>.</p>
<h3 id="filesystems">Filesystems</h3>
<p>The EFI partition is FAT and Windows partition is NTFS as required. I chose
to make the boot partition as well as the Arch root partition EXT4-formatted,
and the shared data partition is read/written using dislocker and NTFS-3G.</p>
<h3 id="mounting">Mounting</h3>
<p>I use a combination of symbolic links and bind mounts to keep my Documents,
Pictures, and other large directories on the shared data partition.
The reason I don’t put my full home directory in there is that the NTFS
BitLocker partition can’t store Linux permissions, and some tools like gpg
enjoy having specific permissions set on their data directories. Bind mounts
are nice because they are all described in my fstab file, but programs
like Nautilus confuse some of them for external drives.</p>
<h2 id="aur-downloaders">AUR downloaders</h2>
<p>While it is quite easy to build AUR packages with common tools (clone
the package’s AUR repository with Git, run <code class="language-plaintext highlighter-rouge">makepkg -s</code>, then
<code class="language-plaintext highlighter-rouge">pacman -U</code>), there exist programs to make this three-step process even
easier. Personally, I’ve really liked
<a href="https://github.com/aurapm/aura">Aura</a> (which is written in Haskell!) as
I can use it to install both official and user packages. There are
plenty of others out there though.</p>
<h2 id="window-managers">Window managers</h2>
<p>With the switch I decided to try out i3 (specifically i3-gaps), after realizing
my workflow of having 10 command prompt windows open just didn’t work so
well. Using i3, I’ve found myself using workspaces a lot more than I used
the equivalent tool in Windows (virtual desktops). Plus it’s really fast
to switch between windows when using multiple monitors. So far I’ve
been very happy switching window paradigms.</p>
<h3 id="sway-hidpi-is-not-yet-polished-as-of-writing">Sway: HiDPI is not yet polished (as of writing)</h3>
<p>Since Wayland seems to be the way of the future, I decided to try out
Sway, which is a version of i3 that uses Wayland rather than X. However,
both my terminal and browser (urxvt and Firefox) looked blurry under it.
Sway must have been trying to scale them by 2x or something.</p>
<h3 id="gnome--i3--simpler-setup">GNOME + i3 = simpler setup</h3>
<p>While I couldn’t find a method using plain i3 to render applications
correctly on my HiDPI display (<code class="language-plaintext highlighter-rouge">xrandr</code> didn’t seem to do anything), I
did find a project, <a href="https://github.com/ctrs/i3-gnome">i3-gnome</a>, that
solves my problems. After using its desktop file then launching
<code class="language-plaintext highlighter-rouge">gnome-flashback</code>, applications correctly scale. This method has the
side effect of launching many GNOME services, but I find they make the
whole setup process easier as I can use the GNOME settings application
and the background processes to handle things such as WiFi, Bluetooth,
blue light adjustment, color profiles, notifications, etc.</p>
<h3 id="gdm">GDM</h3>
<p>I also decided to use GDM for login. I needed to do a little extra work
to <a href="https://unix.stackexchange.com/questions/266586/gdm-how-to-enable-touchpad-tap-to-click">enable tapping to
click</a>.</p>
<h2 id="terminals">Terminals</h2>
<p>Currently I use urxvt, but have not yet found a way to make emoji
display correctly (I just get squares). The GNOME terminal displays them
correctly though, so I may switch to that. Urxvt also seems to sometimes
neglect re-rendering text while scrolling, which is quite strange.</p>
<h2 id="themes">Themes</h2>
<h3 id="arc-dark">Arc Dark</h3>
<p>Arc Dark is a great dark theme. Normally screenshots of GNOME
applications look atrocious, but under Arc Dark everything looks pretty.
Not much other explanation is needed.</p>
<h3 id="arc-icons-and-papirus">Arc Icons and Papirus</h3>
<p>Initially, I decided to go with the Arc Icon Theme. It’s Arc so it must
be great, right? However, the Papirus icon theme has more
icons and, in my opinion, looks nicer and more consistent.</p>
<h2 id="polybar">Polybar</h2>
<p>I’ve been using Polybar to display things such as my i3 windows, time,
wifi network, etc.</p>
<h3 id="fonts-material-icons--extended-material">Fonts (Material Icons + Extended Material)</h3>
<p>Since I use a HiDPI display, the siji font Polybar uses (a bitmap font)
displays at half the size it should. That makes it half as easy to read.
To halve the difficulty in reading the icons, I switched to both the
official Material Design Icons (I think I installed that with the AUR
package <code class="language-plaintext highlighter-rouge">ttf-material-icons</code>?) and an <a href="https://materialdesignicons.com/">unofficial, extended
version</a>. The unofficial version
includes all the official icons, but the official ones seem to center
better (emphasis on seem). I got the font for the unofficial set via the
website. The git repository hosting it also has an SVG version of the font,
which embeds both icon names and character codes, that you can pull up
in a text editor. This makes finding an icon by its name fairly easy
because I can use GNOME’s Character Map application to search for the
character code and copy the icon.</p>
<p>My config is below:</p>
<div class="language-conf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">font</span>-<span class="m">0</span> = <span class="s2">"DejaVu Sans Mono:pixelsize=18;4.5"</span>
<span class="n">font</span>-<span class="m">1</span> = <span class="n">Material</span> <span class="n">Design</span> <span class="n">Icons</span>:<span class="n">pixelsize</span>=<span class="m">18</span>;<span class="m">3</span>.<span class="m">5</span>
<span class="n">font</span>-<span class="m">2</span> = <span class="n">Material</span> <span class="n">Icons</span>:<span class="n">pixelsize</span>=<span class="m">18</span>;<span class="m">5</span>.<span class="m">5</span>
<span class="n">font</span>-<span class="m">3</span> = <span class="n">Font</span> <span class="n">Awesome</span> <span class="m">5</span> <span class="n">Free</span>:<span class="n">pixelsize</span>=<span class="m">14</span>;<span class="m">4</span>
<span class="n">font</span>-<span class="m">4</span> = <span class="n">Font</span> <span class="n">Awesome</span> <span class="m">5</span> <span class="n">Free</span> <span class="n">Solid</span>:<span class="n">pixelsize</span>=<span class="m">14</span>;<span class="m">4</span>
</code></pre></div></div>
<p>I included the Font Awesome fonts because the Material Design icons
do not have a nice set of thermometer icons.</p>
<h3 id="spotify">Spotify</h3>
<p>There exist many, many scripts to show music player info in polybar. I
use a fork I made of <a href="https://github.com/Jvanrhijn/polybar-spotify">this
one</a>.</p>
<h3 id="transparency-via-compton">Transparency via Compton</h3>
<p>To make polybar transparent, I installed the compositor compton. I could
achieve the same effect by giving polybar a background image (not
supported but it may be easy to add). But compton also enables nice
little flashy animations like windows fading out as they are minimized.
So it’s a keeper.</p>
<h2 id="xps-13-stuff-and-arch-wiki">XPS 13 stuff and Arch wiki</h2>
<p>The Arch wiki is an amazing resource. A stunning, beautiful,
super-useful, ultra-helpful, astoundingly amazing resource. Many
programs have pages on the wiki, with example config files and other
info their manual pages may leave out. The wiki also happened to have a
nice page on my laptop, pointing to the WiFi and Bluetooth drivers I
needed to install to get the Broadcom chip to work, as well as the
DisplayLink driver I needed to install to use my HDMI-to-USB dongle.</p>
<h2 id="rofi-and-albert">Rofi and Albert</h2>
<figure>
<img src="https://rianadon.github.io/blog/assets/2018-08-13-albert.png" alt="Albert" width="400px" />
<figcaption>Albert with the Papirus icon theme</figcaption>
</figure>
<p>I’ve always been fascinated by Apple’s Spotlight. Microsoft’s Cortana
sufficed, but its file search was slow and never seemed to give me what
I wanted. Thus, I played around with both
<a href="https://github.com/DaveDavenport/rofi">Rofi</a> and
<a href="https://github.com/albertlauncher/albert">Albert</a>. Personally I’ve
come to prefer Albert as it displays more information per file, and I
couldn’t figure out how to disable the shadows Rofi added to icons.</p>
<h2 id="dict--gnome-dictionary">dict + gnome-dictionary</h2>
<p><img src="https://rianadon.github.io/blog/assets/2018-08-13-dictionary.png" alt="GNOME Dictionary" width="400px" /></p>
<p>Albert has a nice extension to look up words in a dictionary, so I’ve
been using GNOME Dictionary a bit. However, the default setup of the app
requires internet access. Interestingly, there exists a nice program
(dictd) that allows you to set up your own dictionary server compatible
with GNOME Dictionary. There exist a plenitude of tutorials on
integrating dictd and GNOME Dictionary online, so I won’t write my own
here.</p>
<h3 id="aur-dictionaries">AUR dictionaries</h3>
<p>It seems all dictd dictionaries are in the AUR. Below are the ones I’ve
installed:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">dict-devils</code>: A dictionary limited in scope but unlimited in wit</li>
<li><code class="language-plaintext highlighter-rouge">dict-foldoc</code>: More computing words and acronyms than you thought
were possible</li>
<li><code class="language-plaintext highlighter-rouge">dict-gcide</code>: The Collaborative International Dictionary of English.
This is what most of the GNOME Dictionary searches yield results
from.</li>
<li><code class="language-plaintext highlighter-rouge">dict-wikt-en-all</code>: The English Wiktionary. Just like Wikipedia, it
has a <strong>lot</strong> of entries. Like everything.</li>
<li><code class="language-plaintext highlighter-rouge">dict-wn</code>: Wordnet, another dictionary full of entries</li>
</ul>
<p>There are likely more, but these cover most words in the English
language.</p>
<h2 id="i3-bindings-for-screenshots">i3 bindings for screenshots</h2>
<p>Initially, I made a key binding that used scrot to take screenshots. It
took me a little while to find out that I needed to use the <code class="language-plaintext highlighter-rouge">--release</code>
option with <code class="language-plaintext highlighter-rouge">bindsym</code> to make mouse selection (the <code class="language-plaintext highlighter-rouge">-s</code> option with
scrot) work. However, scrot’s rectangle selections never turn out to
be nice rectangle selections and instead end up as modern art
selections. Preferring to avoid artifacts rather than view modern art, I
switched to Slop and Maim, which do not have this problem.</p>
<h2 id="grabc-and-xzoom">Grabc and xzoom</h2>
<p>For color picking, my current workflow is to zoom in on the region I
want to grab the color of using xzoom, then use grabc to capture the
color from the zoom window. It’s not great, but it’s the best I’ve
come up with so far.</p>
<h2 id="firefox-mods">Firefox mods</h2>
<p>I’ve been using Firefox for a variety of reasons, one of the main
ones being that its user interface is very customizable through CSS.</p>
<h3 id="tab-bar-hiding">Tab bar hiding</h3>
<p>One such modification was that I hid the tab bar. Since it takes up valuable
space, I memorize my small number of tabs (or forget about them), then
use multiple windows and i3’s tabbing system to accommodate more tabs.
This is accomplished with the following line, placed inside the
<a href="https://wiki.archlinux.org/index.php/Firefox/Tweaks#General_user_interface_CSS_settings"><code class="language-plaintext highlighter-rouge">userChrome.css</code></a>
file:</p>
<div class="language-css highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">#tabbrowser-tabs</span> <span class="p">{</span> <span class="nl">visibility</span><span class="p">:</span> <span class="nb">collapse</span> <span class="cp">!important</span><span class="p">;</span> <span class="p">}</span>
</code></pre></div></div>
<h3 id="dark-theme-support">Dark theme "support"</h3>
<p>With a dark theme, text inputs will sometimes inherit the dark
background colors and light text of the GTK theme being used. The Arch
Wiki of course
<a href="https://wiki.archlinux.org/index.php/Firefox/Tweaks#Unreadable_input_fields_with_dark_GTK.2B_themes">documents</a>
this problem and gives several solutions. Personally, I launch Firefox
with the light Arc theme.</p>
<h2 id="useful-apps">Useful apps</h2>
<h3 id="okular-vs-evince">Okular vs. Evince</h3>
<p>For viewing PDFs, I’ve tried the viewers from both KDE (Okular) and GNOME (Evince).
For annotating, Okular has much greater support (you can draw and edit
highlighting colors), while Evince has much more limited features in
regard to annotating. However, I’m not a big fan of Okular’s sidebar and
like how many GNOME apps forgo menubars (at least while viewing them in
i3).</p>
<h4 id="kvantum">Kvantum</h4>
<p>Being based on KDE platforms (Qt to be specific), Okular doesn’t
obey GTK themes and does not fit in at all by default. To fix this,
I’ve installed the theme manager Kvantum that has an Arc-Dark theme
inside of it. This one looks pretty darn close to the GTK theme.</p>
<p>To enable it for all apps, I’ve added the following line to my <code class="language-plaintext highlighter-rouge">/etc/environment</code> file along with a few other variables that make Qt scale correctly (credits to the commenters in <a href="https://github.com/linuxmint/Cinnamon/issues/4902">this GitHub issue</a> for the HiDPI fixes):</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">QT_STYLE_OVERRIDE</span><span class="o">=</span>kvantum
<span class="nv">QT_SCALE_FACTOR</span><span class="o">=</span>1
<span class="nv">QT_AUTO_SCREEN_SCALE_FACTOR</span><span class="o">=</span>0
<span class="nv">QT_SCREEN_SCALE_FACTORS</span><span class="o">=</span>2
</code></pre></div></div>
<h3 id="festival">Festival</h3>
<p>Sometimes it’s nice to have text I’m writing read back to me via
text-to-speech. Festival is a nice little tool for this purpose, and
there’s also an <a href="https://www.emacswiki.org/emacs/festival.el">Emacs
script</a> that uses festival
to read selected text.</p>
<h3 id="continued-over">Continued over</h3>
<p>There are also a number of programs I used under WSL that I still
find super useful. They are listed below.</p>
<ol>
<li>
<p>Fish</p>
<p>After using bash for some time, I switched to zsh because of the
pretty powerline bars and plugins made available by Oh My Zsh.
However, it was much slower than bash under the Windows subsystem.
So I found the next best thing, fish, and started using it. Soon, I
discovered the slightly different syntax didn’t matter too much to
me (I can still write shell scripts in bash), and started relying on
the history auto-suggestions. So I’ve continued using that.</p>
</li>
<li>
<p>Weechat</p>
<p>While Weechat is primarily an IRC client, it has plugins that
support things such as Matrix, which I use. I also somewhat recently
discovered it has a package for Emacs, so I don’t even need to use
the command line interface! Although compared to many other
command-line chat applications, weechat’s interface is far, far
superior.</p>
</li>
<li>
<p>Fzf</p>
<p>In cases that I need to do a fuzzy search over my terminal history,
fzf is wonderful. The page also has a nice little script to do a
fuzzy search over launched processes then kill the ones you select.</p>
</li>
<li>
<p>Pass</p>
<p>Pass is an extremely minimalistic password manager, but it supports
syncing with git! This means I can reuse the git server I have at my
home to sync my passwords across my laptop and phone! You have no
idea how wonderful this has been.</p>
</li>
<li>
<p>Exa</p>
<p>I’ve aliased <code class="language-plaintext highlighter-rouge">ls</code> to Exa, which provides tree listings and more
coloring than the built-in tool. The main reason I use it is for the
colors though.</p>
</li>
</ol>
<h3 id="discurses-and-hangups-and-bitlbee">Discurses and Hangups and BitlBee</h3>
<p>For two of my three most used chat applications, Discord and Hangouts, I
initially used the separate clients Discurses and Hangups. However, these
applications don’t have an interface as feature-filled as weechat. For
example, it’s difficult to navigate through Discurses’s channel list.
Eventually, I found an application called BitlBee which spawns an IRC
server that can proxy my Discord and Hangouts accounts. The <a href="https://github.com/sm00th/bitlbee-discord">Discord
plugin</a> for BitlBee was
fairly easy to set up, but the Hangouts one was more involved. There are
nice instructions
<a href="https://demu.red/blog/2016/12/setting-up-sms-in-irc-via-bitlbee-with-purple-hangouts/here">here</a>.
The only problem is that because I run BitlBee on my laptop, it won’t
always be connected to the internet, and may not fetch every message
that comes my way. I’ve had to resort to my phone or web applications
to look up history, and BitlBee to send messages.</p>
<h2 id="drawing--still-no-solution">Drawing — still no solution</h2>
<figure>
<img alt="Drawing with Gromit-MPX" src="https://rianadon.github.io/blog/assets/2018-08-13-gromit.png" width="500px" />
<figcaption>Some flowers I drew with Gromit-MPX on top of pipes.sh</figcaption>
</figure>
<p>One of the features I most miss from Windows is the Ink Workspace.
Unfortunately, I have not found a suitable equivalent for Linux. Krita
seems to come the closest, but it uses the touchscreen only for moving
the canvas around. A mouse, pen, or tablet is needed to do the
drawing. GIMP supports touchscreen drawing, but it’s a bit more than a
simple drawing application. The best option I’ve found so far is
<a href="https://github.com/bk138/gromit-mpx">Gromit-MPX</a>, which allows you to
draw over your screen, but it’s still far from the Windows Ink
Workspace.</p>
<h2 id="touchegg-gestures">Touchegg: gestures</h2>
<p>There’s a program called Touchegg that allows custom gestures on
touchscreens and mice. I haven’t looked into it much, but it seems
promising.</p>
<h2 id="notifications">Notifications</h2>
<p><img src="https://rianadon.github.io/blog/assets/2018-08-13-notification.png" alt="A notification from Dunst" width="400px" /></p>
<p>While GNOME Flashback does display notifications, I wanted a little more
control over their appearance. Thus, I use Dunst. Above is how I configured mine.</p>
<h2 id="battery-life">Battery life</h2>
<p>This was one of my biggest worries with Linux. Thus far, it’s been only
somewhat worse than Windows. Apparently, people have gotten 20 hours of
battery life with Linux and my laptop, but I have yet to achieve that.
Maybe one day…</p>
<h2 id="conclusion">Conclusion</h2>
<p>Thus far, things have been looking good. It’s refreshing being able to
use a system I can configure with short commands from the terminal. And
I’m really liking the tiling window paradigm. It’ll be interesting to see
how far this goes.</p>rianadonEmoji vectors and optimization. Oh my! (Part 3: Assembly and assembly)2018-07-26T00:00:00+00:002018-07-26T00:00:00+00:00https://rianadon.github.io/blog/2018/07/26/fourth-post<blockquote>
<p>This is Part 3 - the third and final part. Because it’s 3rd, that means there were two posts before it. These detailed how I’m using dot products to compare vectors that corresponded to words corresponding to emoji. Unless you came here for only the asm.js, those posts give some nice background info. Also, if you were expecting proof that asm.js solves every problem out there, this is not the post for you. As a warning, things go badly.</p>
</blockquote>
<p>JavaScript can be inefficient. Taking dot products over 2800 length-300 vectors is a lengthy process. I ran this computation from my last blog post 100 times in Chrome:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">dotProduct</span><span class="p">(</span><span class="nx">view1</span><span class="p">,</span> <span class="nx">index1</span><span class="p">,</span> <span class="nx">view2</span><span class="p">,</span> <span class="nx">index2</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">let</span> <span class="nx">dot</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o"><</span> <span class="mi">300</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span> <span class="c1">// Each word vector is length 300</span>
<span class="nx">dot</span> <span class="o">+=</span> <span class="nx">view1</span><span class="p">.</span><span class="nx">getFloat32</span><span class="p">(</span><span class="nx">index1</span> <span class="o">*</span> <span class="mi">300</span> <span class="o">*</span> <span class="mi">4</span> <span class="o">+</span> <span class="nx">i</span> <span class="o">*</span> <span class="mi">4</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
<span class="o">*</span> <span class="nx">view2</span><span class="p">.</span><span class="nx">getFloat32</span><span class="p">(</span><span class="nx">index2</span> <span class="o">*</span> <span class="mi">300</span> <span class="o">*</span> <span class="mi">4</span> <span class="o">+</span> <span class="nx">i</span> <span class="o">*</span> <span class="mi">4</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nx">dot</span>
<span class="p">}</span>
<span class="c1">// The part below gets put in a for loop to run 100 times</span>
<span class="kd">const</span> <span class="nx">products</span> <span class="o">=</span> <span class="nx">emojiVocab</span><span class="p">.</span><span class="nx">map</span><span class="p">((</span><span class="nx">_</span><span class="p">,</span> <span class="nx">emojiIndex</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="k">return</span> <span class="p">[</span><span class="nx">emojiIndex</span><span class="p">,</span> <span class="nx">dotProduct</span><span class="p">(</span><span class="nx">weights</span><span class="p">,</span> <span class="nx">index</span><span class="p">,</span> <span class="nx">emojiWeights</span><span class="p">,</span> <span class="nx">emojiIndex</span><span class="p">)]</span>
<span class="p">})</span>
</code></pre></div></div>
<p>and got the following results:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>On phone: 44277.39999999176 ms / 100 = 442.7739999999176 ms
On laptop: 13675.79999999725 ms / 100 = 136.7579999999725 ms
</code></pre></div></div>
<p>While 0.1 seconds isn’t bad, almost half a second is noticeable. So what’s the problem here? Well, each dot product multiplies 300 pairs of floating point numbers and then adds the results together. Let’s call that 600 floating point operations. Multiply that by the 2800 vectors being compared and you get 1.68 million floating point operations per search. That’s a lot of floating point operations. On my phone, this comes out to a speed of 3.8 megaFLOPS. Now most processors support <em>at least</em> one flop per instruction cycle. Multiply that by maybe 4 for the extra indexing operations, and you get 15.2 MHz. This means my Snapdragon 810 phone is running JavaScript slower than an Arduino Uno runs (16 MHz).</p>
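<p>For anyone who wants to check the arithmetic, here is the same back-of-the-envelope estimate as a few lines of JavaScript. The factor of 4 for indexing overhead is the rough guess from the paragraph above, not a measured value, and the variable names are only for illustration.</p>

```javascript
// Back-of-the-envelope estimate using the phone timing reported above.
const opsPerDotProduct = 300 * 2   // 300 multiplications + roughly 300 additions
const vectorsCompared = 2800       // emoji vectors compared per search
const opsPerSearch = opsPerDotProduct * vectorsCompared  // 1,680,000

const secondsPerSearch = 442.774 / 1000  // phone: about 442.8 ms per search
const mflops = opsPerSearch / secondsPerSearch / 1e6
const cyclesPerOp = 4                    // assumed overhead for indexing
const equivalentMHz = mflops * cyclesPerOp

console.log(mflops.toFixed(1), equivalentMHz.toFixed(1))  // 3.8 15.2
```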
<p>How can this be fixed? Let’s drop JavaScript and move to assembly!</p>
<h2 id="handwriting-asmjs">Handwriting Asm.js</h2>
<p>There’s a thing called emscripten out there. You can compile a little C or C++ program that prints “hello world” into 50 KB of JavaScript. Yes, that may be necessary boilerplate, but I need to implement about 10 lines of JavaScript here. 1% code / 99% boilerplate is not a good ratio. So I ignore popular recommendations and write some asm.js myself. It’s actually not too terrible.</p>
<h3 id="asmjs-basics">Asm.js basics</h3>
<p>The very first asm.js example I found that works was <a href="https://gist.github.com/yomotsu/4ff8a3f441d16fd81250">this little Gist</a>. For some reason Chrome complained about the syntax when running examples taken from the <a href="http://asmjs.org/spec/latest/">asm.js specification</a>.</p>
<p>So let’s start with the (almost) simplest asm.js code I can get:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">AsmModule</span><span class="p">(</span><span class="nx">stdlib</span><span class="p">,</span> <span class="nx">foreign</span><span class="p">,</span> <span class="nx">buffer</span><span class="p">)</span> <span class="p">{</span>
<span class="dl">'</span><span class="s1">use asm</span><span class="dl">'</span>
<span class="kd">function</span> <span class="nx">test</span><span class="p">()</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="o">|</span><span class="mi">0</span>
<span class="p">}</span>
<span class="k">return</span> <span class="p">{</span> <span class="na">test</span><span class="p">:</span> <span class="nx">test</span> <span class="p">};</span>
<span class="p">}</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">AsmModule</span><span class="p">().</span><span class="nx">test</span><span class="p">())</span>
</code></pre></div></div>
<p>What’s going on here?</p>
<p>First, asm.js likes modules. The standard way of running asm.js code is to define an asm module, pass it a few parameters, then use its exported methods (here only <code class="language-plaintext highlighter-rouge">test</code>).</p>
<p>These parameters allow you to reference <code class="language-plaintext highlighter-rouge">Math</code> modules, use JavaScript methods, and use a heap. To quote the specification:</p>
<blockquote>
<p>An asm.js module can take up to three optional parameters, providing access to external JavaScript code and data:</p>
<ul>
<li>a <strong>standard library</strong> object, providing access to a limited subset of the JavaScript <a href="http://asmjs.org/spec/latest/#standard-library">standard libraries</a>;</li>
<li>a <strong>foreign function interface</strong> (FFI), providing access to custom external JavaScript functions; and</li>
<li>a <strong>heap buffer</strong>, providing a single <a href="https://developer.mozilla.org/en-US/docs/JavaScript/Typed_arrays/ArrayBuffer"><code class="language-plaintext highlighter-rouge">ArrayBuffer</code></a> to act as the asm.js heap.</li>
</ul>
</blockquote>
<p>For now we don’t need any. The standard library and heap buffer will come in use later.</p>
<p>You’ll also notice (maybe) the weird <code class="language-plaintext highlighter-rouge">0|0</code>. No, this is not some kind of Unicode art owl eyes thing (<code class="language-plaintext highlighter-rouge">ಠ|ಠ</code>). This is how you specify types. There’s a really nice reference for this kind of stuff <a href="https://danigb.github.io/2017/asmjs">here</a>. Anyways, the <code class="language-plaintext highlighter-rouge">|0</code> makes the <code class="language-plaintext highlighter-rouge">0</code> return value an integer.</p>
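<p>These annotations are ordinary JavaScript coercions that asm.js reuses as type hints, so they can be tried outside of any asm module. A small sketch (the helper names <code class="language-plaintext highlighter-rouge">asInt</code>, <code class="language-plaintext highlighter-rouge">asDouble</code> and <code class="language-plaintext highlighter-rouge">asFloat</code> are made up for illustration):</p>

```javascript
// Plain-JavaScript behaviour of the coercions asm.js reuses as type annotations.
const asInt = (x) => x | 0             // 32-bit signed integer
const asDouble = (x) => +x             // double
const asFloat = (x) => Math.fround(x)  // nearest single-precision float

console.log(asInt(3.7))            // 3 (truncates toward zero)
console.log(asInt(-3.7))           // -3
console.log(asInt(2 ** 32 + 5))    // 5 (wraps modulo 2^32)
console.log(asDouble('1.5'))       // 1.5
console.log(asFloat(0.1) === 0.1)  // false: 0.1 is not exactly representable as a float32
```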
<h3 id="a-little-more-complicated">A little more complicated…</h3>
<p>Since the code will eventually have to compute dot products, let’s start out with a little multiplication (with single-precision floats since that’s what the code used before).</p>
<p>Sounds easy right? Let’s take a stab at it:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">AsmModule</span><span class="p">(</span><span class="nx">stdlib</span><span class="p">,</span> <span class="nx">foreign</span><span class="p">,</span> <span class="nx">buffer</span><span class="p">)</span> <span class="p">{</span>
<span class="dl">'</span><span class="s1">use asm</span><span class="dl">'</span>
<span class="kd">function</span> <span class="nx">test</span><span class="p">(</span><span class="nx">a</span><span class="p">,</span> <span class="nx">b</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="nx">a</span> <span class="o">*</span> <span class="nx">b</span>
<span class="p">}</span>
<span class="k">return</span> <span class="p">{</span> <span class="na">test</span><span class="p">:</span> <span class="nx">test</span> <span class="p">};</span>
<span class="p">}</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">AsmModule</span><span class="p">().</span><span class="nx">test</span><span class="p">(</span><span class="mf">0.4</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">))</span>
</code></pre></div></div>
<p>That should give <code class="language-plaintext highlighter-rouge">1.6</code> right?</p>
<p>Unless <code class="language-plaintext highlighter-rouge">Invalid asm.js: Unexpected token</code> equals <code class="language-plaintext highlighter-rouge">1.6</code> (this <em>is</em> JavaScript, so who knows), there’s an oopsie.</p>
<p>Asm.js is typed. Here there are no types. <code class="language-plaintext highlighter-rouge">a</code> could be an integer, double, or float. The compiler has no idea. It’s expecting type declarations for the parameters. That’s done by adding type annotations as such:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">var</span> <span class="nx">fround</span> <span class="o">=</span> <span class="nx">stdlib</span><span class="p">.</span><span class="nb">Math</span><span class="p">.</span><span class="nx">fround</span>
<span class="kd">function</span> <span class="nx">test</span><span class="p">(</span><span class="nx">a</span><span class="p">,</span> <span class="nx">b</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">a</span> <span class="o">=</span> <span class="nx">fround</span><span class="p">(</span><span class="nx">a</span><span class="p">)</span>
<span class="nx">b</span> <span class="o">=</span> <span class="nx">fround</span><span class="p">(</span><span class="nx">b</span><span class="p">)</span>
<span class="k">return</span> <span class="nx">a</span> <span class="o">*</span> <span class="nx">b</span>
<span class="p">}</span>
</code></pre></div></div>
<p>First, <code class="language-plaintext highlighter-rouge">fround</code> gets taken from JavaScript’s math library. This is how asm.js declares things as floats. The function normally rounds a number to the nearest single-precision float, so that’s a good choice for making the script run the same with and without asm!</p>
<p>Now the module requires this magic <code class="language-plaintext highlighter-rouge">stdlib</code>. We can just pass <code class="language-plaintext highlighter-rouge">window</code> for that:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">AsmModule</span><span class="p">(</span><span class="nb">window</span><span class="p">).</span><span class="nx">test</span><span class="p">(</span><span class="mf">0.4</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">))</span>
</code></pre></div></div>
<p>Now when running the code, Chrome gives a <code class="language-plaintext highlighter-rouge">Invalid asm.js: Invalid return type</code>. That’s a little better! All that has to be done now is to also use <code class="language-plaintext highlighter-rouge">fround</code> to document the return type:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">return</span> <span class="nx">fround</span><span class="p">(</span><span class="nx">a</span> <span class="o">*</span> <span class="nx">b</span><span class="p">)</span>
</code></pre></div></div>
<p>And that gives <code class="language-plaintext highlighter-rouge">1.600000023841858</code> as the answer. Voilà!</p>
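<p>The trailing digits come from single-precision rounding rather than from asm.js itself. The same result can be reproduced in plain JavaScript with <code class="language-plaintext highlighter-rouge">Math.fround</code>:</p>

```javascript
const a = Math.fround(0.4)  // 0.4 rounded to the nearest float32
const b = Math.fround(4.0)  // 4 is exactly representable as a float32

console.log(a)                   // 0.4000000059604645
console.log(Math.fround(a * b))  // 1.600000023841858
console.log(0.4 * 4.0)           // 1.6 in double precision
```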
<h3 id="hip-hoppity-heap">Hip-hoppity-heap</h3>
<p>All the numbers we want to dot-product-ize are in an <code class="language-plaintext highlighter-rouge">ArrayBuffer</code>. That’ll have to eventually be passed to asm.js.</p>
<p>To pass a heap, it first has to be created:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">var</span> <span class="nx">heap</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">ArrayBuffer</span><span class="p">(</span><span class="mh">0x10000</span><span class="p">)</span>
</code></pre></div></div>
<p>Here, <code class="language-plaintext highlighter-rouge">0x10000</code> is the minimum size the heap can have.</p>
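<p>Larger heaps can’t be any size either. As far as I can tell from the asm.js specification, the byte length has to be a power of two below 2<sup>24</sup>, or else a multiple of 2<sup>24</sup>. Below is a sketch of a helper that rounds a requested size up to a length that satisfies that rule. The name <code class="language-plaintext highlighter-rouge">asmHeapSize</code> is hypothetical, and the floor of <code class="language-plaintext highlighter-rouge">0x10000</code> is the minimum quoted above rather than something taken from the specification.</p>

```javascript
// Round a requested byte count up to a heap length asm.js should accept.
// Assumption: lengths are powers of two up to 2^24, then multiples of 2^24.
function asmHeapSize(bytes) {
  const MIN = 0x10000   // minimum heap size used in this post
  const BIG = 1 << 24   // above this, sizes go up in steps of 2^24
  if (bytes > BIG) return Math.ceil(bytes / BIG) * BIG
  let size = MIN
  while (size < bytes) size *= 2  // next power of two
  return size
}

console.log(asmHeapSize(100).toString(16))      // 10000
console.log(asmHeapSize(3361200).toString(16))  // 400000
```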
<p>The heap then gets passed as the third parameter to the module:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">AsmModule</span><span class="p">(</span><span class="nb">window</span><span class="p">,</span> <span class="kc">null</span><span class="p">,</span> <span class="nx">heap</span><span class="p">).</span><span class="nx">test</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
</code></pre></div></div>
<p>Since the floats are going to be stored in the heap, I’ve replaced the <code class="language-plaintext highlighter-rouge">0.4</code> and <code class="language-plaintext highlighter-rouge">4.0</code> (the floats to multiply), with their indexes in the ArrayBuffer. Single precision floats take up 4 bytes, so the indexes are offset by 4.</p>
<p>Speaking of storing the floats in the heaps, we can do that with a <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/DataView"><code class="language-plaintext highlighter-rouge">DataView</code></a>:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">var</span> <span class="nx">view</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">DataView</span><span class="p">(</span><span class="nx">heap</span><span class="p">)</span>
<span class="nx">view</span><span class="p">.</span><span class="nx">setFloat32</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
<span class="nx">view</span><span class="p">.</span><span class="nx">setFloat32</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
</code></pre></div></div>
<p>I’m going to be assuming my system is little endian. If the system isn’t, it isn’t going to work with the little-endian data I’m going to give it. We’ll have to hope for the best.</p>
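<p>Rather than only hoping, the byte order can be checked at runtime. Typed arrays such as <code class="language-plaintext highlighter-rouge">Float32Array</code> use the platform’s byte order, while <code class="language-plaintext highlighter-rouge">DataView</code> uses whatever its last argument says. A minimal sketch (the function name <code class="language-plaintext highlighter-rouge">isLittleEndian</code> is made up for illustration):</p>

```javascript
// Detect the platform's byte order: store a 16-bit 1 and see which byte holds it.
function isLittleEndian() {
  const probe = new Uint16Array([1])
  return new Uint8Array(probe.buffer)[0] === 1
}

const buffer = new ArrayBuffer(8)
new DataView(buffer).setFloat32(4, 4.0, true)  // explicitly little-endian write
const floats = new Float32Array(buffer)        // reads in platform byte order

console.log(isLittleEndian())
console.log(floats[1] === 4.0)  // true only on little-endian platforms
```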
<p>Then to write the asm function:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">var</span> <span class="nx">heap</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">stdlib</span><span class="p">.</span><span class="nb">Float32Array</span><span class="p">(</span><span class="nx">buffer</span><span class="p">)</span> <span class="c1">// Cast the heap to a bunch of 32 bit floats</span>
<span class="kd">function</span> <span class="nx">test</span><span class="p">(</span><span class="nx">a</span><span class="p">,</span> <span class="nx">b</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Declare a and b to be integers</span>
<span class="nx">a</span> <span class="o">=</span> <span class="nx">a</span><span class="o">|</span><span class="mi">0</span>
<span class="nx">b</span> <span class="o">=</span> <span class="nx">b</span><span class="o">|</span><span class="mi">0</span>
<span class="k">return</span> <span class="nx">fround</span><span class="p">(</span><span class="nx">heap</span><span class="p">[</span><span class="nx">a</span><span class="o">>></span><span class="mi">2</span><span class="p">]</span> <span class="o">*</span> <span class="nx">heap</span><span class="p">[</span><span class="nx">b</span><span class="o">>></span><span class="mi">2</span><span class="p">])</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">>>2</code> is needed to convert the byte indexes passed in to the float indexes of the <code class="language-plaintext highlighter-rouge">Float32Array</code>. Why couldn’t we have just passed in the float indexes like <code class="language-plaintext highlighter-rouge">test(0, 1)</code>? The asm.js compiler throws the error <code class="language-plaintext highlighter-rouge">Invalid asm.js: Expected shift of word size</code>. For a variable index, asm.js wants heap accesses written as a byte offset shifted right by the element size, which is 2 bits for 4-byte floats.</p>
<p>Anyways, that gives back <code class="language-plaintext highlighter-rouge">1.600000023841858</code> as the result, same as before. Nothing new.</p>
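<p>The relationship between byte offsets and float indexes is easy to see in plain JavaScript. This sketch writes two floats at byte offsets 0 and 4 and reads the second one back through a <code class="language-plaintext highlighter-rouge">Float32Array</code>, assuming a little-endian machine:</p>

```javascript
const buffer = new ArrayBuffer(16)
const view = new DataView(buffer)
view.setFloat32(0, 0.4, true)  // byte offset 0
view.setFloat32(4, 4.0, true)  // byte offset 4

const heap = new Float32Array(buffer)  // same bytes, indexed per 4-byte float
const byteOffset = 4

console.log(byteOffset >> 2)        // 1: element index of the second float
console.log(heap[byteOffset >> 2])  // 4 on a little-endian machine
```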
<h3 id="dot-products">Dot products</h3>
<p>Now that we’ve got multiplication down let’s try addition (that’s a great order isn’t it?).</p>
<p>We’ll take the dot product of two length-300 vectors. The code sans explanation is below as I’ve already spent two sections explaining things:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">AsmModule</span><span class="p">(</span><span class="nx">stdlib</span><span class="p">,</span> <span class="nx">foreign</span><span class="p">,</span> <span class="nx">buffer</span><span class="p">)</span> <span class="p">{</span>
<span class="dl">'</span><span class="s1">use asm</span><span class="dl">'</span>
<span class="kd">var</span> <span class="nx">fround</span> <span class="o">=</span> <span class="nx">stdlib</span><span class="p">.</span><span class="nb">Math</span><span class="p">.</span><span class="nx">fround</span>
<span class="kd">var</span> <span class="nx">heap</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">stdlib</span><span class="p">.</span><span class="nb">Float32Array</span><span class="p">(</span><span class="nx">buffer</span><span class="p">)</span>
<span class="kd">var</span> <span class="nx">vectorlength</span> <span class="o">=</span> <span class="mi">300</span> <span class="c1">// In declarations we don't need the "|"</span>
<span class="kd">function</span> <span class="nx">dotprod</span><span class="p">(</span><span class="nx">a</span><span class="p">,</span> <span class="nx">b</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// These parameters will be reused as indexes</span>
<span class="nx">a</span> <span class="o">=</span> <span class="nx">a</span><span class="o">|</span><span class="mi">0</span>
<span class="nx">b</span> <span class="o">=</span> <span class="nx">b</span><span class="o">|</span><span class="mi">0</span>
<span class="kd">var</span> <span class="nx">prod</span> <span class="o">=</span> <span class="nx">fround</span><span class="p">(</span><span class="mf">0.0</span><span class="p">),</span> <span class="nx">max</span> <span class="o">=</span> <span class="mi">0</span>
<span class="c1">// Asm.js likes computations verbosely typed</span>
<span class="nx">max</span> <span class="o">=</span> <span class="p">(</span><span class="nx">a</span> <span class="o">+</span> <span class="p">(</span><span class="nx">vectorlength</span><span class="o"><<</span><span class="mi">2</span><span class="p">))</span><span class="o">|</span><span class="mi">0</span><span class="p">;</span> <span class="c1">// shift vector length out by 2</span>
<span class="c1">// to account for byte indexing</span>
<span class="k">while</span> <span class="p">((</span><span class="nx">a</span><span class="o">|</span><span class="mi">0</span><span class="p">)</span> <span class="o"><</span> <span class="p">(</span><span class="nx">max</span><span class="o">|</span><span class="mi">0</span><span class="p">))</span> <span class="p">{</span> <span class="c1">// This loop should run 300 times</span>
<span class="nx">prod</span> <span class="o">=</span> <span class="nx">fround</span><span class="p">(</span><span class="nx">prod</span> <span class="o">+</span> <span class="nx">fround</span><span class="p">(</span><span class="nx">heap</span><span class="p">[</span><span class="nx">a</span><span class="o">>></span><span class="mi">2</span><span class="p">]</span> <span class="o">*</span> <span class="nx">heap</span><span class="p">[</span><span class="nx">b</span><span class="o">>></span><span class="mi">2</span><span class="p">]))</span>
<span class="c1">// increment the indexes</span>
<span class="nx">a</span> <span class="o">=</span> <span class="p">(</span><span class="nx">a</span> <span class="o">+</span> <span class="mi">4</span><span class="p">)</span><span class="o">|</span><span class="mi">0</span>
<span class="nx">b</span> <span class="o">=</span> <span class="p">(</span><span class="nx">b</span> <span class="o">+</span> <span class="mi">4</span><span class="p">)</span><span class="o">|</span><span class="mi">0</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nx">prod</span> <span class="c1">// the type of prod was already defined so no fround needed here</span>
<span class="p">}</span>
<span class="k">return</span> <span class="p">{</span> <span class="na">dotprod</span><span class="p">:</span> <span class="nx">dotprod</span> <span class="p">};</span>
<span class="p">}</span>
<span class="kd">const</span> <span class="nx">heap</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">ArrayBuffer</span><span class="p">(</span><span class="mh">0x10000</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">view</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">DataView</span><span class="p">(</span><span class="nx">heap</span><span class="p">)</span>
<span class="c1">// Take the dot product of the vectors <0, 1, 2, ... 10, 0, 0 ...></span>
<span class="c1">// and <0, 1, 2, ... 10, 0, 0 ...> = sum of squares up to and including 10</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="nx">i</span> <span class="o"><=</span> <span class="mi">10</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">view</span><span class="p">.</span><span class="nx">setFloat32</span><span class="p">(</span><span class="nx">i</span><span class="o"><<</span><span class="mi">2</span><span class="p">,</span> <span class="nx">i</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
<span class="nx">view</span><span class="p">.</span><span class="nx">setFloat32</span><span class="p">(</span><span class="mi">300</span><span class="o">+</span><span class="nx">i</span><span class="o"><<</span><span class="mi">2</span><span class="p">,</span> <span class="nx">i</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
<span class="p">}</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">AsmModule</span><span class="p">(</span><span class="nb">window</span><span class="p">,</span> <span class="kc">null</span><span class="p">,</span> <span class="nx">heap</span><span class="p">).</span><span class="nx">dotprod</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">300</span><span class="o">*</span><span class="mi">4</span><span class="p">))</span>
</code></pre></div></div>
<p>Now I just need to do this 2800 times.</p>
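<p>I’ll organize the heap in the following manner: the first 1200 bytes (300 floats) hold the word vector being compared, and the emoji vectors follow back to back, starting at byte 1200.</p>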
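<p>Going by the offsets used in <code class="language-plaintext highlighter-rouge">createdotter</code> in the next section, the word vector lives at byte 0 and emoji vector <em>k</em> starts one vector length further along for each index. A sketch of that arithmetic (the constant and helper names are hypothetical, for illustration only):</p>

```javascript
const VECTOR_LENGTH = 300    // floats per word vector
const BYTES_PER_FLOAT = 4
const VECTOR_BYTES = VECTOR_LENGTH * BYTES_PER_FLOAT  // 1200

// Hypothetical helpers, named here for illustration only
const queryOffset = () => 0  // the word vector being compared
const emojiOffset = (emojiIndex) => (emojiIndex + 1) * VECTOR_BYTES

console.log(queryOffset())      // 0
console.log(emojiOffset(0))     // 1200
console.log(emojiOffset(2799))  // 3360000
```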
<h2 id="incorporating-the-change">Incorporating the change</h2>
<p>Before, I had a <code class="language-plaintext highlighter-rouge">dotProduct</code> function that, given two arrays and two indexes, would return the dot product of the two 300-length vectors at those indexes. Below is a rewrite: a function that returns a replacement <code class="language-plaintext highlighter-rouge">dotProduct</code> backed by the asm.js module defined above. This new <code class="language-plaintext highlighter-rouge">createdotter</code> function allows the asm module to be initialized only once.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">createdotter</span><span class="p">(</span><span class="nx">emojiWeights</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Create the heap</span>
<span class="kd">const</span> <span class="nx">heapUnit</span> <span class="o">=</span> <span class="mh">0x100000</span> <span class="c1">// heap size must be rounded to this</span>
<span class="kd">const</span> <span class="nx">heapSize</span> <span class="o">=</span> <span class="nx">emojiWeights</span><span class="p">.</span><span class="nx">byteLength</span> <span class="o">+</span> <span class="mi">300</span><span class="o">*</span><span class="mi">4</span> <span class="c1">// desired size</span>
<span class="kd">const</span> <span class="nx">heap</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">ArrayBuffer</span><span class="p">(</span><span class="nb">Math</span><span class="p">.</span><span class="nx">ceil</span><span class="p">(</span><span class="nx">heapSize</span> <span class="o">/</span> <span class="nx">heapUnit</span><span class="p">)</span> <span class="o">*</span> <span class="nx">heapUnit</span><span class="p">)</span>
<span class="c1">// Populate the heap with the emoji weights</span>
<span class="kd">const</span> <span class="nx">heapArray</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint8Array</span><span class="p">(</span><span class="nx">heap</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">weightsArray</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint8Array</span><span class="p">(</span><span class="nx">emojiWeights</span><span class="p">.</span><span class="nx">buffer</span><span class="p">,</span> <span class="nx">emojiWeights</span><span class="p">.</span><span class="nx">byteOffset</span><span class="p">)</span>
<span class="nx">heapArray</span><span class="p">.</span><span class="kd">set</span><span class="p">(</span><span class="nx">weightsArray</span><span class="p">,</span> <span class="mi">300</span><span class="o">*</span><span class="mi">4</span><span class="p">)</span>
<span class="c1">// Set up the module</span>
<span class="kd">const</span> <span class="nx">mod</span> <span class="o">=</span> <span class="nx">AsmModule</span><span class="p">(</span><span class="nb">window</span><span class="p">,</span> <span class="kc">null</span><span class="p">,</span> <span class="nx">heap</span><span class="p">)</span>
<span class="k">return</span> <span class="kd">function</span> <span class="nx">dotProduct</span><span class="p">(</span><span class="nx">view1</span><span class="p">,</span> <span class="nx">index1</span><span class="p">,</span> <span class="nx">_</span><span class="p">,</span> <span class="nx">emojiIndex</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Populate the heap with the vector to find the dot product with</span>
<span class="kd">const</span> <span class="nx">viewArray</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint8Array</span><span class="p">(</span><span class="nx">view1</span><span class="p">.</span><span class="nx">buffer</span><span class="p">,</span> <span class="nx">view1</span><span class="p">.</span><span class="nx">byteOffset</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">slice</span> <span class="o">=</span> <span class="nx">viewArray</span><span class="p">.</span><span class="nx">slice</span><span class="p">(</span><span class="nx">index1</span><span class="o">*</span><span class="mi">300</span><span class="o">*</span><span class="mi">4</span><span class="p">,</span> <span class="p">(</span><span class="nx">index1</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="mi">300</span><span class="o">*</span><span class="mi">4</span><span class="p">)</span>
<span class="nx">heapArray</span><span class="p">.</span><span class="kd">set</span><span class="p">(</span><span class="nx">slice</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="c1">// Call the asm.js module</span>
<span class="k">return</span> <span class="nx">mod</span><span class="p">.</span><span class="nx">dotprod</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="p">(</span><span class="nx">emojiIndex</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="mi">300</span><span class="o">*</span><span class="mi">4</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>With this change I get the following new performance numbers for the code at the beginning of the post (and using the new dotProduct function):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>On phone: 1735.899999999674 ms / 100 = 17.35899999999674 ms
On laptop: 745.900000008987 ms / 100 = 7.45900000008987 ms
</code></pre></div></div>
<p>Old timings for reference:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>On phone: 44277.39999999176 ms / 100 = 442.7739999999176 ms
On laptop: 13675.79999999725 ms / 100 = 136.7579999999725 ms
</code></pre></div></div>
<p>That’s much faster: roughly 25 times faster on the phone and 18 times faster on the laptop!</p>
<h1 id="tldr-dataview-is-slow">TL;DR <code class="language-plaintext highlighter-rouge">DataView</code> is slow</h1>
<p>For fun (i.e. because I had some errors and the asm.js wasn’t compiling), I tried removing the <code class="language-plaintext highlighter-rouge">'use asm'</code>, so all code ran in normal JavaScript. Interestingly, the results weren’t too much worse:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>On phone: 2094.8000000062166 ms / 100 = 20.948000000062166 ms
On laptop: 1315.3999999922235 ms / 100 = 13.153999999922235 ms
</code></pre></div></div>
<p>The only major difference between the asm.js code and the old code is that the former uses <code class="language-plaintext highlighter-rouge">Float32Array</code>s and the latter uses <code class="language-plaintext highlighter-rouge">DataView</code>s. So I swapped out the APIs for each other in my code (modifying the old <code class="language-plaintext highlighter-rouge">dotProduct</code> function), and got the following results:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>On phone: 852.1999999938998 ms / 100 = 8.521999999938998 ms
On laptop: 374.0999999863561 ms / 100 = 3.740999999863561 ms
</code></pre></div></div>
<p>So using asm.js actually made things <em>slower</em> than when using <code class="language-plaintext highlighter-rouge">Float32Array</code>s. That’s likely the result of the copying of the compare vector into the heap. Oh well, that just goes to show how optimized JavaScript is on its own, without extensions/limits such as <code class="language-plaintext highlighter-rouge">asm.js</code>.</p>
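<p>For reference, here is a sketch of what a <code class="language-plaintext highlighter-rouge">Float32Array</code> version of the old <code class="language-plaintext highlighter-rouge">dotProduct</code> function could look like. This is a reconstruction under assumptions, not the exact code that was timed: it takes <code class="language-plaintext highlighter-rouge">Float32Array</code>s instead of <code class="language-plaintext highlighter-rouge">DataView</code>s, and the name <code class="language-plaintext highlighter-rouge">dotProductFloat32</code> is made up.</p>

```javascript
const VECTOR_LENGTH = 300

// arr1 and arr2 are Float32Arrays holding consecutive 300-float vectors.
function dotProductFloat32(arr1, index1, arr2, index2) {
  const offset1 = index1 * VECTOR_LENGTH
  const offset2 = index2 * VECTOR_LENGTH
  let dot = 0
  for (let i = 0; i < VECTOR_LENGTH; i++) {
    dot += arr1[offset1 + i] * arr2[offset2 + i]
  }
  return dot
}

// Small check: two vectors <1, 2, ..., 10, 0, 0, ...> give the sum of squares up to 10
const vectors = new Float32Array(2 * VECTOR_LENGTH)
for (let i = 0; i < 10; i++) {
  vectors[i] = i + 1
  vectors[VECTOR_LENGTH + i] = i + 1
}
console.log(dotProductFloat32(vectors, 0, vectors, 1))  // 385
```

<p>An existing <code class="language-plaintext highlighter-rouge">DataView</code> can be wrapped without copying via <code class="language-plaintext highlighter-rouge">new Float32Array(view.buffer, view.byteOffset, view.byteLength / 4)</code>, but only when the byte offset is a multiple of 4 and the machine is little endian like the data.</p>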
<p>Unfortunately there’s no part 4. I’m sorry. There’s only the <a href="https://rianadon.github.io/Emoji-Suggester-v2/">online implementation of this stuff</a>.</p>
rianadon
This is Part 3 - the third and final part. Because it’s 3rd, that means there were two posts before it. These detailed how I’m using dot products to compare vectors that corresponded to words corresponding to emoji. Unless you came here for only the asm.js, those posts give some nice background info. Also, if you were expecting proof that asm.js solves every problem out there, this is not the post for you. As a warning, things go badly.
Emoji vectors and optimization. Oh my! (Part 2: Npy files in the browser)
2018-07-23T00:00:00+00:00
2018-07-23T00:00:00+00:00
https://rianadon.github.io/blog/2018/07/23/third-post
<p>In Part 1, I discussed how to implement a somewhat simple emoji searcher in Python. But Python (especially once you start having to install libraries like numpy) isn’t that portable. What if we could do this in the browser? That’s what this post is going to be all about!</p>
<p>From the methods of the last post, I saved the files <code class="language-plaintext highlighter-rouge">good_vocab.txt</code> (a list of all the words in the pared-down word2vec model), and <code class="language-plaintext highlighter-rouge">good_weights.npy</code> (the word vectors of the model). I’ll also save the <code class="language-plaintext highlighter-rouge">mapping</code> variable to <code class="language-plaintext highlighter-rouge">mapping.json</code> (which will have the word -> emoji mappings).</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="k">async</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">mapping</span> <span class="o">=</span> <span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">mapping.json</span><span class="dl">'</span><span class="p">)).</span><span class="nx">json</span><span class="p">()</span>
<span class="kd">const</span> <span class="nx">vocab</span> <span class="o">=</span> <span class="p">(</span><span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">good_vocab.txt</span><span class="dl">'</span><span class="p">)).</span><span class="nx">text</span><span class="p">()).</span><span class="nx">split</span><span class="p">(</span><span class="dl">'</span><span class="s1"> </span><span class="dl">'</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">weights</span> <span class="o">=</span> <span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">good_weights.npy</span><span class="dl">'</span><span class="p">)).</span><span class="nx">arrayBuffer</span><span class="p">()</span>
<span class="p">})()</span>
</code></pre></div></div>
<p>I’m using the <code class="language-plaintext highlighter-rouge">fetch</code> API here because, well, it makes the code very simple. The mapping gets parsed into a dictionary (technically an object) of emoji descriptor words -> emoji characters, the vocabulary arrives as one string that gets split on spaces into an array of words, and the weights arrive as an <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer"><code class="language-plaintext highlighter-rouge">ArrayBuffer</code></a> (since they’re binary data and wouldn’t parse as UTF-8 or any other text encoding).</p>
<p>And in the network panel, we see this little beauty:</p>
<p><img src="https://rianadon.github.io/blog/assets/2018-07-05-network.png" alt="Firefox network panel" /></p>
<p>16 seconds to fetch a file! From localhost! I wouldn’t even dare think of how slow this would be over the internets.</p>
<h2 id="chunking-the-input">Chunking the input</h2>
<p>To alleviate this, I’ll divide the weights file into chunks. At any given time, all we need are the word vectors for the emoji words and the vector for the word being searched for. So I’ll make one file for the emoji word vectors, and then group all the vectors by the first letter of the word they correspond to.</p>
<p>At the same time I’ll also divide the vocabulary in this manner.</p>
<p>That will give <code class="language-plaintext highlighter-rouge">data/emoji.npy</code> (emoji word vectors) and <code class="language-plaintext highlighter-rouge">data/emoji.txt</code> (emoji words); <code class="language-plaintext highlighter-rouge">data/dat-a.npy</code> (word vectors for words starting with <code class="language-plaintext highlighter-rouge">a</code> or <code class="language-plaintext highlighter-rouge">A</code>) and <code class="language-plaintext highlighter-rouge">data/dat-a.txt</code> (words starting with <code class="language-plaintext highlighter-rouge">a</code> or <code class="language-plaintext highlighter-rouge">A</code>), etc. etc.</p>
<p>This gives smaller files, but <code class="language-plaintext highlighter-rouge">dat-s.npy</code> measures 55 MB! That’s still a lot. My data plan would not be pleased. So I’ll use the same process on the first two characters, and split further on the third character wherever a two-character group is still large (giving files for <code class="language-plaintext highlighter-rouge">aaa</code>, <code class="language-plaintext highlighter-rouge">aab</code>, <code class="language-plaintext highlighter-rouge">aha</code>, etc.)</p>
<p>Here’s the Python script in case you were interested (variables carried over from previous post):</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">json</span>
<span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">product</span>

<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>

<span class="c1"># Write the two emoji data files
</span><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'data/emoji.txt'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">f</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="s">' '</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">emoji_words</span><span class="p">))</span>
<span class="n">np</span><span class="p">.</span><span class="n">save</span><span class="p">(</span><span class="s">'data/emoji.npy'</span><span class="p">,</span> <span class="n">emoji_norms</span><span class="p">)</span>
<span class="c1"># Make an array of the first, second, and third letters
# If a word has only one letter, make the second letter a space (so it goes in `dat-a .txt`)
</span><span class="n">firsts</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">lower</span><span class="p">()</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">good_vocab</span><span class="p">])</span>
<span class="n">seconds</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="n">lower</span><span class="p">()</span> <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="o">></span> <span class="mi">1</span> <span class="k">else</span> <span class="s">' '</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">good_vocab</span><span class="p">])</span>
<span class="n">thirds</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">].</span><span class="n">lower</span><span class="p">()</span> <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="o">></span> <span class="mi">2</span> <span class="k">else</span> <span class="s">' '</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">good_vocab</span><span class="p">])</span>
<span class="c1"># Find what characters we'd need to have in the file names
</span><span class="n">possible</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="nb">set</span><span class="p">(</span><span class="n">firsts</span><span class="p">)</span> <span class="o">|</span> <span class="nb">set</span><span class="p">(</span><span class="n">seconds</span><span class="p">)</span> <span class="o">|</span> <span class="nb">set</span><span class="p">(</span><span class="n">thirds</span><span class="p">))</span>
<span class="n">allowed</span> <span class="o">=</span> <span class="n">possible</span><span class="p">[:</span><span class="n">possible</span><span class="p">.</span><span class="n">index</span><span class="p">(</span><span class="s">'~'</span><span class="p">)</span><span class="o">+</span><span class="mi">1</span><span class="p">]</span> <span class="c1"># Select all up to "~"
</span>
<span class="n">fs_forbid</span> <span class="o">=</span> <span class="s">'<>:"/</span><span class="se">\\</span><span class="s">|?*'</span> <span class="c1"># Characters forbidden by file system
</span>
<span class="c1"># Save the npy and txt files
</span><span class="k">def</span> <span class="nf">save</span><span class="p">(</span><span class="n">letters</span><span class="p">,</span> <span class="n">indices</span><span class="p">):</span>
    <span class="n">filename</span> <span class="o">=</span> <span class="s">''</span><span class="p">.</span><span class="n">join</span><span class="p">(</span> <span class="c1"># Escape characters forbidden by file system
</span>        <span class="nb">chr</span><span class="p">(</span><span class="nb">ord</span><span class="p">(</span><span class="n">c</span><span class="p">)</span><span class="o">+</span><span class="mi">128</span><span class="p">)</span> <span class="k">if</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">fs_forbid</span> <span class="k">else</span> <span class="n">c</span> <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">letters</span>
    <span class="p">)</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'data/dat-{}.txt'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">filename</span><span class="p">),</span> <span class="s">'w'</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="s">'utf-8'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="n">f</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="s">' '</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">good_vocab</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">indices</span><span class="p">[</span><span class="mi">0</span><span class="p">]))</span>
    <span class="n">np</span><span class="p">.</span><span class="n">save</span><span class="p">(</span><span class="s">'data/dat-{}.npy'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">filename</span><span class="p">),</span> <span class="n">good_weights</span><span class="p">[</span><span class="n">indices</span><span class="p">])</span>
<span class="n">threefiles</span> <span class="o">=</span> <span class="p">[]</span> <span class="c1"># All two-letter combinations that will have to be split up into three-letter files
</span><span class="k">for</span> <span class="n">fl</span><span class="p">,</span> <span class="n">sl</span> <span class="ow">in</span> <span class="n">product</span><span class="p">(</span><span class="n">allowed</span><span class="p">,</span> <span class="n">repeat</span><span class="o">=</span><span class="mi">2</span><span class="p">):</span> <span class="c1"># Loop over all 2-letter pairs, doubled letters like "aa" included
</span>    <span class="n">matched</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">logical_and</span><span class="p">(</span><span class="n">firsts</span> <span class="o">==</span> <span class="n">fl</span><span class="p">,</span> <span class="n">seconds</span> <span class="o">==</span> <span class="n">sl</span><span class="p">)</span> <span class="c1"># See which words match the two letters
</span>    <span class="k">if</span> <span class="n">np</span><span class="p">.</span><span class="n">count_nonzero</span><span class="p">(</span><span class="n">matched</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span> <span class="c1"># Skip if none match
</span>        <span class="k">continue</span>
    <span class="k">elif</span> <span class="n">np</span><span class="p">.</span><span class="n">count_nonzero</span><span class="p">(</span><span class="n">matched</span><span class="p">)</span> <span class="o"><</span> <span class="mi">1000</span><span class="p">:</span> <span class="c1"># Save if fewer than 1000 words match
</span>        <span class="n">save</span><span class="p">(</span><span class="n">fl</span> <span class="o">+</span> <span class="n">sl</span><span class="p">,</span> <span class="n">np</span><span class="p">.</span><span class="n">where</span><span class="p">(</span><span class="n">matched</span><span class="p">))</span>
        <span class="k">continue</span>
    <span class="n">threefiles</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">fl</span> <span class="o">+</span> <span class="n">sl</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">tl</span> <span class="ow">in</span> <span class="n">allowed</span><span class="p">:</span> <span class="c1"># Split over the third letter and save those files
</span>        <span class="n">indices</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">where</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">logical_and</span><span class="p">(</span><span class="n">matched</span><span class="p">,</span> <span class="n">thirds</span> <span class="o">==</span> <span class="n">tl</span><span class="p">))</span>
        <span class="k">if</span> <span class="n">indices</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">size</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
            <span class="k">continue</span>
        <span class="n">save</span><span class="p">(</span><span class="n">fl</span> <span class="o">+</span> <span class="n">sl</span> <span class="o">+</span> <span class="n">tl</span><span class="p">,</span> <span class="n">indices</span><span class="p">)</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'data/threefiles.json'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">json</span><span class="p">.</span><span class="n">dump</span><span class="p">(</span><span class="n">threefiles</span><span class="p">,</span> <span class="n">f</span><span class="p">)</span>
</code></pre></div></div>
<p>This ignores some of the other characters words start with (like non-English characters and emoji). I’ll assume no one is going to be searching for words with them anyway, as the dataset is mostly English.</p>
<p>Here <code class="language-plaintext highlighter-rouge">allowed</code> contains the characters <code class="language-plaintext highlighter-rouge">! " # % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; = > @ ^ _ ` a b c d e f g h i i̇ j k l m n o p q r s t u v w x y z ~</code> and space.</p>
<p>After running the script, the <code class="language-plaintext highlighter-rouge">data</code> directory contains several important files:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">emoji.txt</code> and <code class="language-plaintext highlighter-rouge">emoji.npy</code>: emoji vocab and vectors</li>
<li><code class="language-plaintext highlighter-rouge">dat-8j.txt</code>, <code class="language-plaintext highlighter-rouge">dat-8j.npy</code>, <code class="language-plaintext highlighter-rouge">...</code>: words and vectors for 2-letter sequences</li>
<li><code class="language-plaintext highlighter-rouge">dat-exg.txt</code>, <code class="language-plaintext highlighter-rouge">dat-exg.npy</code>, <code class="language-plaintext highlighter-rouge">...</code>: words and vectors for 3-letter sequences</li>
<li><code class="language-plaintext highlighter-rouge">threefiles.json</code>: JSON-encoded array of two-letter sequences that have been expanded out to three-letter sequences since a lot of words start with these two letters</li>
</ul>
<h2 id="parsing-files">Parsing files</h2>
<p>Now that the files are chunked, I’ll need a function to accept the search term and fetch the matching word and vector files:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">mapping</span> <span class="o">=</span> <span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">mapping.json</span><span class="dl">'</span><span class="p">)).</span><span class="nx">json</span><span class="p">()</span>
<span class="kd">const</span> <span class="nx">emojiVocab</span> <span class="o">=</span> <span class="p">(</span><span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">data/emoji.txt</span><span class="dl">'</span><span class="p">)).</span><span class="nx">text</span><span class="p">()).</span><span class="nx">split</span><span class="p">(</span><span class="dl">'</span><span class="s1"> </span><span class="dl">'</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">emojiWeights</span> <span class="o">=</span> <span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">data/emoji.npy</span><span class="dl">'</span><span class="p">)).</span><span class="nx">arrayBuffer</span><span class="p">()</span>
<span class="kd">const</span> <span class="nx">threeFiles</span> <span class="o">=</span> <span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">data/threefiles.json</span><span class="dl">'</span><span class="p">)).</span><span class="nx">json</span><span class="p">()</span>
<span class="kd">const</span> <span class="nx">fsForbid</span> <span class="o">=</span> <span class="dl">'</span><span class="s1"><>:"/</span><span class="se">\\</span><span class="s1">|?*</span><span class="dl">'</span>
<span class="k">async</span> <span class="kd">function</span> <span class="nx">search</span><span class="p">(</span><span class="nx">term</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">term</span> <span class="o">=</span> <span class="nx">term</span><span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/ /g</span><span class="p">,</span> <span class="dl">'</span><span class="s1">_</span><span class="dl">'</span><span class="p">)</span> <span class="c1">// Fit the format of the vocab array</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">term</span><span class="p">.</span><span class="nx">length</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="k">return</span> <span class="p">[]</span> <span class="c1">// Passing nothing as input isn't valid</span>
<span class="c1">// Formulate the path to the file Python outputted earlier</span>
<span class="kd">const</span> <span class="nx">modTerm</span> <span class="o">=</span> <span class="nx">term</span><span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/./g</span><span class="p">,</span> <span class="p">(</span><span class="nx">c</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">fsForbid</span><span class="p">.</span><span class="nx">includes</span><span class="p">(</span><span class="nx">c</span><span class="p">))</span> <span class="k">return</span> <span class="nb">String</span><span class="p">.</span><span class="nx">fromCharCode</span><span class="p">(</span><span class="nx">c</span><span class="p">.</span><span class="nx">charCodeAt</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">+</span> <span class="mi">128</span><span class="p">)</span>
<span class="k">return</span> <span class="nx">c</span>
<span class="p">}).</span><span class="nx">toLowerCase</span><span class="p">()</span>
<span class="c1">// Find first, second, and third letter, substituting spaces as necessary</span>
<span class="kd">const</span> <span class="nx">fl</span> <span class="o">=</span> <span class="nx">modTerm</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="kd">const</span> <span class="nx">sl</span> <span class="o">=</span> <span class="nx">modTerm</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">||</span> <span class="dl">'</span><span class="s1"> </span><span class="dl">'</span>
<span class="kd">const</span> <span class="nx">tl</span> <span class="o">=</span> <span class="nx">modTerm</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">||</span> <span class="dl">'</span><span class="s1"> </span><span class="dl">'</span>
<span class="kd">const</span> <span class="nx">filename</span> <span class="o">=</span> <span class="nx">threeFiles</span><span class="p">.</span><span class="nx">includes</span><span class="p">(</span><span class="nx">fl</span><span class="o">+</span><span class="nx">sl</span><span class="p">)</span> <span class="p">?</span> <span class="nx">fl</span><span class="o">+</span><span class="nx">sl</span><span class="o">+</span><span class="nx">tl</span> <span class="p">:</span> <span class="nx">fl</span><span class="o">+</span><span class="nx">sl</span>
<span class="c1">// Fetch the relevant vocab and weights files (Python saved them as data/dat-*.txt and data/dat-*.npy)</span>
<span class="kd">const</span> <span class="nx">vocab</span> <span class="o">=</span> <span class="p">(</span><span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="s2">`data/dat-</span><span class="p">${</span><span class="nx">filename</span><span class="p">}</span><span class="s2">.txt`</span><span class="p">)).</span><span class="nx">text</span><span class="p">()).</span><span class="nx">split</span><span class="p">(</span><span class="dl">'</span><span class="s1"> </span><span class="dl">'</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">weights</span> <span class="o">=</span> <span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="s2">`data/dat-</span><span class="p">${</span><span class="nx">filename</span><span class="p">}</span><span class="s2">.npy`</span><span class="p">)).</span><span class="nx">arrayBuffer</span><span class="p">()</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Much of this code mirrors the Python code above that generated the filenames, since the same escaping and letter-splitting rules are needed here to locate the files.</p>
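<p>The filename rules can also be pulled out into a small pure function, which makes them easy to check without touching the network. This is only a sketch: <code class="language-plaintext highlighter-rouge">chunkName</code> is a made-up helper name, and it assumes <code class="language-plaintext highlighter-rouge">threeFiles</code> holds the unescaped two-letter prefixes exactly as the Python script wrote them.</p>

```javascript
// Hypothetical helper (not part of the original post): maps a search term to
// the chunk name used in data/dat-<name>.txt and data/dat-<name>.npy.
const FS_FORBID = '<>:"/\\|?*' // Same forbidden characters as the Python script

function chunkName(term, threeFiles) {
  // Lowercase, then pad with spaces so short words still have three "letters"
  const [fl, sl, tl] = Array.from(term.toLowerCase().padEnd(3, ' ')).slice(0, 3)
  // Assumption: threeFiles lists the two-letter prefixes that were split further
  const key = threeFiles.includes(fl + sl) ? fl + sl + tl : fl + sl
  // Escape forbidden characters the same way Python did: shift the char code by 128
  return Array.from(key, c =>
    FS_FORBID.includes(c) ? String.fromCharCode(c.charCodeAt(0) + 128) : c
  ).join('')
}
```

<p>With this, <code class="language-plaintext highlighter-rouge">chunkName('hello', ['he'])</code> picks the three-letter chunk, while an empty <code class="language-plaintext highlighter-rouge">threeFiles</code> falls back to the two-letter one.</p>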
<p>Then to compare the search term with the vocab:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Inside function search(term)</span>
<span class="kd">let</span> <span class="nx">index</span> <span class="o">=</span> <span class="nx">vocab</span><span class="p">.</span><span class="nx">indexOf</span><span class="p">(</span><span class="nx">term</span><span class="p">)</span> <span class="c1">// Try an exact, case-sensitive match first</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">index</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">lowercaseTerm</span> <span class="o">=</span> <span class="nx">term</span><span class="p">.</span><span class="nx">toLowerCase</span><span class="p">()</span> <span class="c1">// Find index without considering case</span>
<span class="nx">index</span> <span class="o">=</span> <span class="nx">vocab</span><span class="p">.</span><span class="nx">findIndex</span><span class="p">(</span><span class="nx">v</span> <span class="o">=></span> <span class="nx">v</span><span class="p">.</span><span class="nx">toLowerCase</span><span class="p">()</span> <span class="o">==</span> <span class="nx">lowercaseTerm</span><span class="p">)</span>
<span class="p">}</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">index</span><span class="p">)</span>
</code></pre></div></div>
<p>If a case-sensitive search doesn’t find the term, <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/findIndex"><code class="language-plaintext highlighter-rouge">findIndex</code></a> is used to build a case-insensitive version of <code class="language-plaintext highlighter-rouge">indexOf</code>.</p>
<p>And to call the function:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">search</span><span class="p">(</span><span class="dl">'</span><span class="s1">hello</span><span class="dl">'</span><span class="p">)</span>
</code></pre></div></div>
<p>Now to parse that npy file…</p>
<p><a href="https://docs.scipy.org/doc/numpy/neps/npy-format.html">The docs</a> say the first 6 bytes should resolve to the magic string <code class="language-plaintext highlighter-rouge">\x93NUMPY</code>. Let’s verify that:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">magicData</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint8Array</span><span class="p">(</span><span class="nx">weights</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">6</span><span class="p">)</span>
<span class="kd">let</span> <span class="nx">magic</span> <span class="o">=</span> <span class="dl">''</span>
<span class="nx">magicData</span><span class="p">.</span><span class="nx">forEach</span><span class="p">(</span><span class="nx">c</span> <span class="o">=></span> <span class="nx">magic</span> <span class="o">+=</span> <span class="nb">String</span><span class="p">.</span><span class="nx">fromCharCode</span><span class="p">(</span><span class="nx">c</span><span class="p">))</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">magic</span><span class="p">.</span><span class="nx">charCodeAt</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="nx">magic</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span>
</code></pre></div></div>
<p>This gives <code class="language-plaintext highlighter-rouge">147 NUMPY</code>. <code class="language-plaintext highlighter-rouge">147</code> is <code class="language-plaintext highlighter-rouge">0x93</code> in hexadecimal, so the file is valid!</p>
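<p>Instead of eyeballing the console output, the same check can be done byte by byte. Here’s a minimal sketch; <code class="language-plaintext highlighter-rouge">hasNpyMagic</code> is a hypothetical helper name, not something from a library.</p>

```javascript
// Hypothetical helper: true when the buffer starts with the bytes \x93 N U M P Y
function hasNpyMagic(buffer) {
  const expected = [0x93, 0x4e, 0x55, 0x4d, 0x50, 0x59] // \x93 followed by "NUMPY"
  if (buffer.byteLength < expected.length) return false // Too short to hold the magic string
  const bytes = new Uint8Array(buffer, 0, expected.length)
  return expected.every((byte, i) => bytes[i] === byte)
}
```

<p>Comparing raw bytes avoids building a string at all, and the length check keeps the <code class="language-plaintext highlighter-rouge">Uint8Array</code> constructor from throwing on a truncated download.</p>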
<p>Next come the major and minor version numbers:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const [major, minor] = new Uint8Array(weights, 6, 2)
console.log(`version ${major}.${minor}`)
</code></pre></div></div>
<p>For me that’s version <code class="language-plaintext highlighter-rouge">1.0</code>.</p>
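<p>The version matters because it decides how wide the header-length field after it is: format version 1.0 stores it in 2 bytes, while 2.0 and later use 4. The rest of this post only deals with 1.0, but a version-aware reader could look roughly like this (<code class="language-plaintext highlighter-rouge">readHeaderLength</code> is a made-up name for the sketch):</p>

```javascript
// Hypothetical helper: returns the header length and the offset where the header text starts
function readHeaderLength(buffer) {
  const major = new Uint8Array(buffer, 6, 1)[0] // Major version byte
  const view = new DataView(buffer)
  if (major === 1) {
    // Version 1.x: 2-byte little-endian unsigned integer at offset 8
    return { headerLength: view.getUint16(8, true), headerStart: 10 }
  }
  // Version 2.0 and later: 4-byte little-endian unsigned integer at offset 8
  return { headerLength: view.getUint32(8, true), headerStart: 12 }
}
```

<p>The array data then begins at <code class="language-plaintext highlighter-rouge">headerStart + headerLength</code>.</p>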
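<p>Next comes the header length. It’s a 2-byte little-endian unsigned short stored right after the version bytes (at offset 8). Then, like the magic string, we can parse the header data into a string:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Bytes 8-9: header length as a little-endian unsigned 16-bit integer</span>
<span class="kd">const</span> <span class="nx">headerLength</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">DataView</span><span class="p">(</span><span class="nx">weights</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">2</span><span class="p">).</span><span class="nx">getUint16</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">headerData</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint8Array</span><span class="p">(</span><span class="nx">weights</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="nx">headerLength</span><span class="p">)</span>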
<span class="kd">let</span> <span class="nx">header</span> <span class="o">=</span> <span class="dl">''</span>
<span class="nx">headerData</span><span class="p">.</span><span class="nx">forEach</span><span class="p">(</span><span class="nx">c</span> <span class="o">=></span> <span class="nx">header</span> <span class="o">+=</span> <span class="nb">String</span><span class="p">.</span><span class="nx">fromCharCode</span><span class="p">(</span><span class="nx">c</span><span class="p">))</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">header</span><span class="p">)</span>
</code></pre></div></div>
<p>This gives <code class="language-plaintext highlighter-rouge">{'descr': '<f4', 'fortran_order': False, 'shape': (496, 300), }</code>. Nice!</p>
<p>A few modifications and we can turn this into JSON that JavaScript understands:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">parsedHeader</span> <span class="o">=</span> <span class="nx">JSON</span><span class="p">.</span><span class="nx">parse</span><span class="p">(</span><span class="nx">header</span>
<span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/'/g</span><span class="p">,</span> <span class="dl">'</span><span class="s1">"</span><span class="dl">'</span><span class="p">)</span>
<span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/False/g</span><span class="p">,</span> <span class="dl">'</span><span class="s1">false</span><span class="dl">'</span><span class="p">)</span>
<span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/True/g</span><span class="p">,</span> <span class="dl">'</span><span class="s1">true</span><span class="dl">'</span><span class="p">)</span>
<span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/</span><span class="se">\(</span><span class="sr">/g</span><span class="p">,</span> <span class="dl">'</span><span class="s1">[</span><span class="dl">'</span><span class="p">)</span>
<span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/</span><span class="se">\)</span><span class="sr">/g</span><span class="p">,</span> <span class="dl">'</span><span class="s1">]</span><span class="dl">'</span><span class="p">)</span>
<span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="sr">/, }/g</span><span class="p">,</span> <span class="dl">'</span><span class="s1">}</span><span class="dl">'</span><span class="p">)</span>
<span class="p">)</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">parsedHeader</span><span class="p">)</span>
</code></pre></div></div>
<p>This gives:</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
</span><span class="nl">"descr"</span><span class="p">:</span><span class="w"> </span><span class="s2">"<f4"</span><span class="p">,</span><span class="w">
</span><span class="nl">"fortran_order"</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span><span class="w">
</span><span class="nl">"shape"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="mi">496</span><span class="p">,</span><span class="w"> </span><span class="mi">300</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
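<p>The <code class="language-plaintext highlighter-rouge">descr</code> field packs three facts into one short string: byte order, value kind, and item size in bytes. A rough sketch of pulling them apart follows; <code class="language-plaintext highlighter-rouge">parseDescr</code> is a hypothetical helper, and it assumes a plain (non-structured) dtype like the one in this file.</p>

```javascript
// Hypothetical helper: splits a simple descr string such as '<f4' into its parts.
// Assumes a plain dtype; structured dtypes are not handled.
function parseDescr(descr) {
  const match = /^([<>|])([a-zA-Z])(\d+)$/.exec(descr)
  if (!match) throw new Error(`Unsupported descr: ${descr}`)
  const [, order, kind, size] = match
  return {
    littleEndian: order !== '>', // '<' is little-endian, '>' is big-endian, '|' means order is irrelevant
    kind,                        // e.g. 'f' for float, 'i' for signed int, 'u' for unsigned int
    bytes: Number(size)          // Item size in bytes
  }
}
```

<p>For <code class="language-plaintext highlighter-rouge">'<f4'</code> this reports little-endian 4-byte floats, which is why <code class="language-plaintext highlighter-rouge">getFloat32</code> with its little-endian flag set to <code class="language-plaintext highlighter-rouge">true</code> is the right <code class="language-plaintext highlighter-rouge">DataView</code> call for this file.</p>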
<p>Then comes the data! Let’s read the first few floats:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">data</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">DataView</span><span class="p">(</span><span class="nx">weights</span><span class="p">,</span> <span class="mi">10</span> <span class="o">+</span> <span class="nx">headerLength</span><span class="p">)</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">data</span><span class="p">.</span><span class="nx">getFloat32</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="kc">true</span><span class="p">))</span> <span class="c1">// 0.024849990382790565</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">data</span><span class="p">.</span><span class="nx">getFloat32</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="kc">true</span><span class="p">))</span> <span class="c1">// 0.03313332051038742</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">data</span><span class="p">.</span><span class="nx">getFloat32</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="kc">true</span><span class="p">))</span> <span class="c1">// 0.019124748185276985</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">data</span><span class="p">.</span><span class="nx">getFloat32</span><span class="p">(</span><span class="mi">12</span><span class="p">,</span> <span class="kc">true</span><span class="p">))</span> <span class="c1">// 0.011755019426345825</span>
</code></pre></div></div>
<p>Seems reasonable. Now we can organize this into a little function for parsing <code class="language-plaintext highlighter-rouge">.npy</code> files:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">parseNpy</span><span class="p">(</span><span class="nx">buffer</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">magicData</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint8Array</span><span class="p">(</span><span class="nx">buffer</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">6</span><span class="p">)</span>
<span class="kd">let</span> <span class="nx">magic</span> <span class="o">=</span> <span class="dl">''</span>
<span class="nx">magicData</span><span class="p">.</span><span class="nx">forEach</span><span class="p">(</span><span class="nx">c</span> <span class="o">=></span> <span class="nx">magic</span> <span class="o">+=</span> <span class="nb">String</span><span class="p">.</span><span class="nx">fromCharCode</span><span class="p">(</span><span class="nx">c</span><span class="p">))</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">magic</span> <span class="o">!==</span> <span class="dl">'</span><span class="se">\</span><span class="s1">x93NUMPY</span><span class="dl">'</span><span class="p">)</span> <span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span><span class="p">(</span><span class="dl">'</span><span class="s1">Invalid magic string</span><span class="dl">'</span><span class="p">)</span>
<span class="kd">const</span> <span class="p">[</span><span class="nx">major</span><span class="p">,</span> <span class="nx">minor</span><span class="p">]</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint8Array</span><span class="p">(</span><span class="nx">buffer</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">major</span> <span class="o">!=</span> <span class="mi">1</span> <span class="o">||</span> <span class="nx">minor</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span><span class="p">(</span><span class="dl">'</span><span class="s1">Only version 1.0 supported</span><span class="dl">'</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">headerLength</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">DataView</span><span class="p">(</span><span class="nx">buffer</span><span class="p">).</span><span class="nx">getUint16</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
<span class="k">return</span> <span class="k">new</span> <span class="nb">DataView</span><span class="p">(</span><span class="nx">buffer</span><span class="p">,</span> <span class="mi">10</span> <span class="o">+</span> <span class="nx">headerLength</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>
<p>To keep things simple I didn’t parse the header. I’ll hope all the files have the correct ordering and data type.</p>
<h2 id="dot-products">Dot products</h2>
<p>I’ll modify the <code class="language-plaintext highlighter-rouge">emojiWeights</code> variable to use the <code class="language-plaintext highlighter-rouge">parseNpy</code> function:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">emojiWeights</span> <span class="o">=</span> <span class="nx">parseNpy</span><span class="p">(</span><span class="k">await</span> <span class="p">(</span><span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">data/emoji.npy</span><span class="dl">'</span><span class="p">)).</span><span class="nx">arrayBuffer</span><span class="p">())</span>
</code></pre></div></div>
<p>Then I’ll make a function to take dot products along these two <code class="language-plaintext highlighter-rouge">DataViews</code>. Here <code class="language-plaintext highlighter-rouge">index1</code> and <code class="language-plaintext highlighter-rouge">index2</code> will refer to the index of our word in the vocab array.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">dotProduct</span><span class="p">(</span><span class="nx">view1</span><span class="p">,</span> <span class="nx">index1</span><span class="p">,</span> <span class="nx">view2</span><span class="p">,</span> <span class="nx">index2</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">let</span> <span class="nx">dot</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o"><</span> <span class="mi">300</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span> <span class="c1">// Each word vector is length 300</span>
<span class="nx">dot</span> <span class="o">+=</span> <span class="nx">view1</span><span class="p">.</span><span class="nx">getFloat32</span><span class="p">(</span><span class="nx">index1</span> <span class="o">*</span> <span class="mi">300</span> <span class="o">*</span> <span class="mi">4</span> <span class="o">+</span> <span class="nx">i</span> <span class="o">*</span> <span class="mi">4</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
<span class="o">*</span> <span class="nx">view2</span><span class="p">.</span><span class="nx">getFloat32</span><span class="p">(</span><span class="nx">index2</span> <span class="o">*</span> <span class="mi">300</span> <span class="o">*</span> <span class="mi">4</span> <span class="o">+</span> <span class="nx">i</span> <span class="o">*</span> <span class="mi">4</span><span class="p">,</span> <span class="kc">true</span><span class="p">)</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nx">dot</span>
<span class="p">}</span>
</code></pre></div></div>
<p>This bit of math works because each 32-bit float takes up 4 bytes. So <code class="language-plaintext highlighter-rouge">i * 4</code> steps through consecutive floats. The <code class="language-plaintext highlighter-rouge">* 300 * 4</code> skips over whole word vectors, since each word vector is encoded as 300 floats (1200 bytes).</p>
<p>So within our <code class="language-plaintext highlighter-rouge">search</code> function that’s being called with the value <code class="language-plaintext highlighter-rouge">hello</code> for now, we can do the following:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// index refers to the one defined earlier up in the search function</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">dotProduct</span><span class="p">(</span><span class="nx">weights</span><span class="p">,</span> <span class="nx">index</span><span class="p">,</span> <span class="nx">emojiWeights</span><span class="p">,</span> <span class="nx">emojiVocab</span><span class="p">.</span><span class="nx">indexOf</span><span class="p">(</span><span class="dl">'</span><span class="s1">hi</span><span class="dl">'</span><span class="p">)))</span>
</code></pre></div></div>
<p>Which gives <code class="language-plaintext highlighter-rouge">.65</code> as the similarity. That’s good (somewhat; it should really be higher, but the parsing is working, I promise you)!</p>
<p>To find matching emoji, we need to consider all of them. So we can iterate over the vocab array and find the best matches:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Compute the dot products, also keeping track of the index they correspond</span>
<span class="c1">// to in the vocab array</span>
<span class="kd">const</span> <span class="nx">products</span> <span class="o">=</span> <span class="nx">emojiVocab</span><span class="p">.</span><span class="nx">map</span><span class="p">((</span><span class="nx">_</span><span class="p">,</span> <span class="nx">emojiIndex</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="k">return</span> <span class="p">[</span><span class="nx">emojiIndex</span><span class="p">,</span> <span class="nx">dotProduct</span><span class="p">(</span><span class="nx">weights</span><span class="p">,</span> <span class="nx">index</span><span class="p">,</span> <span class="nx">emojiWeights</span><span class="p">,</span> <span class="nx">emojiIndex</span><span class="p">)]</span>
<span class="p">})</span>
<span class="c1">// Sort the products descending by their value (index 1 of the inner arrays)</span>
<span class="nx">products</span><span class="p">.</span><span class="nx">sort</span><span class="p">((</span><span class="nx">a</span><span class="p">,</span> <span class="nx">b</span><span class="p">)</span> <span class="o">=></span> <span class="nx">a</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">-</span> <span class="nx">b</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
<span class="nx">products</span><span class="p">.</span><span class="nx">reverse</span><span class="p">()</span>
<span class="c1">// Output the top 10 results</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o"><</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">min</span><span class="p">(</span><span class="nx">products</span><span class="p">.</span><span class="nx">length</span><span class="p">,</span> <span class="mi">10</span><span class="p">);</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">const</span> <span class="p">[</span><span class="nx">index</span><span class="p">,</span> <span class="nx">dotprod</span><span class="p">]</span> <span class="o">=</span> <span class="nx">products</span><span class="p">[</span><span class="nx">i</span><span class="p">]</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">dotprod</span><span class="p">,</span> <span class="nx">emojiVocab</span><span class="p">[</span><span class="nx">index</span><span class="p">])</span>
<span class="p">}</span>
</code></pre></div></div>
<p>This gives:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1.0000000161893414 hello
0.6548984408213944 hi
0.6399055901641588 goodbye
0.5544619264442038 hug
0.4897352942405049 smiling
0.4730159033812229 namaste
0.47205641057925707 chatting
0.45117140669361094 hugs
0.44649946734886803 smile
0.433194592057971 xox
</code></pre></div></div>
<p>Or to find the top ten emoji rather than words:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">emoji</span> <span class="o">=</span> <span class="p">[]</span> <span class="c1">// Keep track of emoji already outputted to avoid duplicates</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o"><</span> <span class="nx">products</span><span class="p">.</span><span class="nx">length</span> <span class="o">&&</span> <span class="nx">emoji</span><span class="p">.</span><span class="nx">length</span> <span class="o"><</span> <span class="mi">10</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">const</span> <span class="p">[</span><span class="nx">index</span><span class="p">,</span> <span class="nx">dotprod</span><span class="p">]</span> <span class="o">=</span> <span class="nx">products</span><span class="p">[</span><span class="nx">i</span><span class="p">]</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">emote</span> <span class="k">of</span> <span class="nx">mapping</span><span class="p">[</span><span class="nx">emojiVocab</span><span class="p">[</span><span class="nx">index</span><span class="p">]])</span> <span class="p">{</span>
<span class="c1">// If we still need more emoji and this is a new one, console.log it</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">emoji</span><span class="p">.</span><span class="nx">length</span> <span class="o"><</span> <span class="mi">10</span> <span class="o">&&</span> <span class="o">!</span><span class="nx">emoji</span><span class="p">.</span><span class="nx">includes</span><span class="p">(</span><span class="nx">emote</span><span class="p">))</span> <span class="p">{</span>
<span class="nx">emoji</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="nx">emote</span><span class="p">)</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">emote</span><span class="p">,</span> <span class="nx">emojiVocab</span><span class="p">[</span><span class="nx">index</span><span class="p">],</span> <span class="nx">dotprod</span><span class="p">.</span><span class="nx">toFixed</span><span class="p">(</span><span class="mi">3</span><span class="p">))</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Which gives:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>👋 hello 1.000
🤗 hug 0.554
😈 smiling 0.490
😙 smiling 0.490
🙂 smiling 0.490
🙏 namaste 0.473
🗨 chatting 0.472
💬 chatting 0.472
😄 smile 0.446
😛 smile 0.446
</code></pre></div></div>
<p>Yay! Next post I’ll be looking at the efficiency of the code so far.</p>rianadonIn Part 1, I discussed how to implement a somewhat simple emoji searcher in Python. But Python (especially once you start having to install libraries like numpy) isn’t that portable. What if we could do this in the browser? That’s what this post is going to be all about!Emoji vectors and optimization. Oh my! (Part 1: Word2Vec and Python)2018-07-05T00:00:00+00:002018-07-05T00:00:00+00:00https://rianadon.github.io/blog/2018/07/05/second-post<h1 id="emoji-vectors-and-optimization-oh-my-part-1-word2vec-and-python">Emoji vectors and optimization. Oh my! (Part 1: Word2Vec and Python)</h1>
<p>A while back, the Slack bot <a href="https://slack.com/apps/A0HKW0SFK-emojibot">EmojiBot</a> went offline, endangering my workflow. You could ask it to find an emoji by sending a query like <code class="language-plaintext highlighter-rouge">@emojibot electricity</code>, and it would reply with emoji such as :electric_plug:, :zap:, :bulb:, and :battery:. But all was okay because in the meantime, I built my own replacement: a <a href="https://github.com/rianadon/Emoji-Suggester/tree/master/emojiserver">Python http server</a> connected to a lovely little <a href="https://hubot.github.com/">Hubot</a> instance. This post details how I went about doing that.</p>
<p>Before I move on, a few basics on the overarching question: how does one match up text to emoji?</p>
<p>One great project I encountered was <a href="https://github.com/muan/emojilib">emojilib</a>. The project contains a mapping of emoji to a few descriptive key words each. You can find a search interface for it <a href="http://emoji.muan.co">here</a>. It works spectacularly for most emoji, but once you pull out a thesaurus things get worse. For example: <code class="language-plaintext highlighter-rouge">ghost</code> finds :ghost:, but <code class="language-plaintext highlighter-rouge">ghoul</code> turns up nothing. <code class="language-plaintext highlighter-rouge">Happy</code> finds a long list of emoji (:grinning:, :grin:, :joy:, and :smiley: to name a few), but <code class="language-plaintext highlighter-rouge">ecstatic</code> finds none. And nothing comes up for <code class="language-plaintext highlighter-rouge">Ryan</code>, my very own name, either!</p>
<p>We can do better. Google has word vectors for 3 million words in its Google News word2vec dataset! Surely there must be a way to utilize this. And there is. There are already methods to make a word2vec model out of emoji, such as the one detailed by <a href="https://arxiv.org/abs/1609.08359">this paper</a>. And while the word2vec training methods are amazingly ingenious and interesting, I decided to not dive into the depths of AI deep neural network machine learning and rather play around with Google’s word embeddings and emojilib.</p>
<p>With word2vec, you can find the similarity of two words by finding the angle between their two vectors. This is the same as taking the arc cosine of the dot product of their unit vectors. However, we can do without the trigonometry and just use the dot products themselves: they are the cosines of those angles, so they range from -1 to 1, with values near 1 meaning very similar.</p>
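<p>To make the relationship concrete, here is a minimal sketch using two made-up 3-dimensional vectors (real word2vec vectors have 300 dimensions). The <code class="language-plaintext highlighter-rouge">unit</code> helper is a name I made up for illustration:</p>

```python
import numpy as np

def unit(v):
    """Scale a vector to length 1."""
    return v / np.linalg.norm(v)

# Made-up vectors standing in for real 300-dimensional word embeddings
a = unit(np.array([1.0, 2.0, 0.0]))
b = unit(np.array([2.0, 1.0, 0.0]))

# For unit vectors, the dot product is the cosine of the angle between them
cosine = float(np.dot(a, b))
# The angle itself is only needed if you want degrees; ranking works on the cosine
angle = float(np.degrees(np.arccos(cosine)))
print(cosine, angle)
```

<p>Since the arc cosine is monotonic, sorting by the dot product gives the same ordering as sorting by the angle, which is why the trigonometry can be skipped.</p>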
<p>With this method, I can build a way to relate fancy words (e.g. ecstatic) -> simple words (e.g. happy) -> emoji, and will be able to build a pretty good emoji searcher!</p>
<h2 id="building-a-map">Building a Map</h2>
<p>Google’s word vectors are for words, not emoji. So I need a way to translate words to emoji. Luckily, this is easy to do with emojilib and a little Python:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">json</span>
<span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">defaultdict</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'emojilib.json'</span><span class="p">,</span> <span class="s">'r'</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="s">'utf-8'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="n">emojis</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">f</span><span class="p">)</span> <span class="c1"># Assume emojilib is downloaded as emojilib.json
</span>
<span class="n">mapping</span> <span class="o">=</span> <span class="n">defaultdict</span><span class="p">(</span><span class="nb">list</span><span class="p">)</span> <span class="c1"># A dict where each item defaults to []
</span>
<span class="k">for</span> <span class="n">emoji_name</span><span class="p">,</span> <span class="n">emoji</span> <span class="ow">in</span> <span class="n">emojis</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>
<span class="n">mapping</span><span class="p">[</span><span class="n">emoji_name</span><span class="p">].</span><span class="n">append</span><span class="p">(</span><span class="n">emoji</span><span class="p">[</span><span class="s">'char'</span><span class="p">])</span>
<span class="k">for</span> <span class="n">part</span> <span class="ow">in</span> <span class="n">emoji_name</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="s">'_'</span><span class="p">):</span>
<span class="n">mapping</span><span class="p">[</span><span class="n">part</span><span class="p">].</span><span class="n">append</span><span class="p">(</span><span class="n">emoji</span><span class="p">[</span><span class="s">'char'</span><span class="p">])</span>
<span class="k">for</span> <span class="n">keyword</span> <span class="ow">in</span> <span class="n">emoji</span><span class="p">[</span><span class="s">'keywords'</span><span class="p">]:</span>
<span class="n">mapping</span><span class="p">[</span><span class="n">keyword</span><span class="p">].</span><span class="n">append</span><span class="p">(</span><span class="n">emoji</span><span class="p">[</span><span class="s">'char'</span><span class="p">])</span> <span class="c1"># Add extra terms into the mapping
</span>
<span class="k">print</span><span class="p">(</span><span class="n">mapping</span><span class="p">)</span> <span class="c1"># The generated mapping of words -> emoji
</span></code></pre></div></div>
<p>Then we get a nice mapping like this for categories:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>flag: 🇬🇧 🇸🇰 🇹🇨 🇲🇰 🇳🇫 🇧🇯 🇹🇷 🇬🇷 🇯🇴 🇰🇲 🇲🇾 🇹🇿 🇵🇭 🇨🇩 🇩🇲 🇬🇳 🇱🇷 🇨🇻 🇻🇬 🇲🇿 🇵🇱 🇻🇦
banner: 🇬🇧 🇸🇰 🇹🇨 🇲🇰 🇳🇫 🇧🇯 🇹🇷 🇬🇷 🇯🇴 🇰🇲 🇲🇾 🇹🇿 🇵🇭 🇨🇩 🇩🇲 🇬🇳 🇱🇷 🇨🇻 🇻🇬 🇲🇿 🇵🇱
country: 🇬🇧 🇸🇰 🇹🇨 🇲🇰 🇳🇫 🇧🇯 🇹🇷 🇬🇷 🇯🇴 🇰🇲 🇲🇾 🇹🇿 🇵🇭 🇨🇩 🇩🇲 🇬🇳 🇱🇷 🇨🇻 🇻🇬 🇲🇿 🇵🇱
nation: 🇬🇧 🇸🇰 🇹🇨 🇲🇰 🇳🇫 🇧🇯 🇹🇷 🇬🇷 🇯🇴 🇰🇲 🇲🇾 🇹🇿 🇵🇭 🇨🇩 🇩🇲 🇬🇳 🇱🇷 🇨🇻 🇻🇬 🇲🇿 🇵🇱
woman: 💑 👩👦 🧗♀️ 🧗♀️ 🚣♀️ 🚣♀️ 👩⚕️ 👩⚕️ 👩👩👧 👩👩👧 🧜♀️ 💂♀️ 🤦♀️ 🤦♀️ 👩🎨 👩🎨
nature: 🦑 🐩 🎐 🌕 🌷 🙈 🌾 🐺 🐑 🏔 🐊 🐏 🐄 🐡 🌚 🌓 🐘
man: 💑 🧝♂️ 🧝♂️ 🤷♂️ 🤷♂️ 👱 👱 🤽♂️ 💁♂️ 💁♂️ 🚣 🧘♂️ 🧘♂️ 🙋♂️ 🙋♂️
face: 😄 😤 😛 🤤 🤤 😢 👂 😕 😀 😬 😷 😁 🌚 🤬 😣 🙃 🙃 🤣
animal: 🦑 🐩 🙈 🐺 🐑 😽 🐊 🐈 🐏 🐄 🐡 🕊 🐘 🦈 🦅 🐤 🦒
human: 💑 👩👦 👩⚕️ 👩👩👧 👫 💁♂️ 👩🎨 👩👩👧👧 👩👧👧 👨🎤 👩🌾 ⛹ 👨👦 👵 🏊 👥 👨✈️ 👨❤️👨 👩👧👦
</code></pre></div></div>
<p>And this for individual emoji:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>lost: 🏳
moving: 📦
sterling: 💷
yummy: 😋
vegas: 🎰
suspension: 🚟
eat: 🍽
bolivarian: 🇻🇪
saucer: 🛸
second: 🥈
</code></pre></div></div>
<p>Wonderful! Now onto utilizing this!</p>
<h2 id="short-notice-making-gensim-more-efficient">Short notice: Making <code class="language-plaintext highlighter-rouge">gensim</code> more efficient</h2>
<p>I use the Python <code class="language-plaintext highlighter-rouge">gensim</code> library for parsing Google’s <code class="language-plaintext highlighter-rouge">.bin</code> file. However, when loading the dataset, <code class="language-plaintext highlighter-rouge">gensim</code> stores it all in memory. With my little 8 GB laptop already running a few browsers, things get a little tight (looking at you, Electron). Actually, extremely tight. Like, Windows freezes as it tries to move memory pages to the disk in a panic. Not too great. To remedy this, you can move all the vectors to a <a href="https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.memmap.html"><code class="language-plaintext highlighter-rouge">memmap</code></a>.</p>
<p><code class="language-plaintext highlighter-rouge">gensim</code> stores its data in two places: an array of all the words (<code class="language-plaintext highlighter-rouge">in</code>, <code class="language-plaintext highlighter-rouge">at</code>, etc.) and a 2D array of vectors, each being mapped to its word by the index. The code below saves both of these for later use (to avoid loading the full model into memory each time):</p>
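<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">json</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">from</span> <span class="nn">gensim.models.keyedvectors</span> <span class="kn">import</span> <span class="n">KeyedVectors</span>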
<span class="n">model</span> <span class="o">=</span> <span class="n">KeyedVectors</span><span class="p">.</span><span class="n">load_word2vec_format</span><span class="p">(</span><span class="s">'GoogleNews-vectors-negative300.bin'</span><span class="p">,</span> <span class="n">binary</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'vocab.json'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="n">json</span><span class="p">.</span><span class="n">dump</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">index2word</span><span class="p">,</span> <span class="n">f</span><span class="p">)</span> <span class="c1"># Save the word strings
</span>
<span class="n">model</span><span class="p">.</span><span class="n">init_sims</span><span class="p">(</span><span class="n">replace</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="c1"># Calculate all the unit vectors
</span><span class="n">norms</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">memmap</span><span class="p">(</span><span class="s">'normsmemmap.dat'</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">model</span><span class="p">.</span><span class="n">wv</span><span class="p">.</span><span class="n">vectors_norm</span><span class="p">.</span><span class="n">dtype</span><span class="p">,</span>
<span class="n">mode</span><span class="o">=</span><span class="s">'w+'</span><span class="p">,</span> <span class="n">shape</span><span class="o">=</span><span class="n">model</span><span class="p">.</span><span class="n">wv</span><span class="p">.</span><span class="n">vectors_norm</span><span class="p">.</span><span class="n">shape</span><span class="p">)</span>
<span class="n">norms</span><span class="p">[:]</span> <span class="o">=</span> <span class="n">model</span><span class="p">.</span><span class="n">wv</span><span class="p">.</span><span class="n">vectors_norm</span> <span class="c1"># Write the normed vectors to the memmap
</span><span class="k">del</span> <span class="n">model</span> <span class="c1"># Discard the model from memory
</span></code></pre></div></div>
<p>And to load it:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'vocab.json'</span><span class="p">,</span> <span class="s">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="n">vocab</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
<span class="n">norms</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">memmap</span><span class="p">(</span><span class="s">'normsmemmap.dat'</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">float32</span><span class="p">).</span><span class="n">reshape</span><span class="p">((</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">300</span><span class="p">))</span>
</code></pre></div></div>
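<p>If the write-then-reshape dance looks odd, here is a small self-contained sketch of the same round trip on a tiny made-up array. The file name and the 4 x 3 shape are assumptions for illustration only:</p>

```python
import os
import tempfile
import numpy as np

# Small made-up matrix standing in for the (3 million x 300) vector array
vectors = np.arange(12, dtype=np.float32).reshape((4, 3))

# Hypothetical file name inside a throwaway temp directory
path = os.path.join(tempfile.mkdtemp(), 'demo_memmap.dat')

# Write: create a file-backed array with the same dtype and shape, then copy in
out = np.memmap(path, dtype=vectors.dtype, mode='w+', shape=vectors.shape)
out[:] = vectors
out.flush()  # push the data out to the file
del out

# Read: the file stores raw bytes only (no shape or dtype), so it comes back
# as a flat array and has to be reshaped using the known row length
loaded = np.memmap(path, dtype=np.float32, mode='r').reshape((-1, 3))
print(loaded.shape)
```

<p>The raw file carries no header, which is why the loading code has to repeat the dtype and the row length of 300 by hand.</p>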
<h2 id="filtering-the-map">Filtering the map</h2>
<p>While Google’s dataset has 3 million words, emoji names can still be obscure enough to avoid appearance in the English language. For example, <code class="language-plaintext highlighter-rouge">shipit</code> is not yet recognized as a word. Neither is <code class="language-plaintext highlighter-rouge">dog2</code>. Country names as lowercase also aren’t included. That requires a little extra parsing work:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lowercasedvocab</span> <span class="o">=</span> <span class="p">[</span><span class="n">word</span><span class="p">.</span><span class="n">lower</span><span class="p">()</span> <span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">vocab</span><span class="p">]</span>
<span class="n">vocab_set</span> <span class="o">=</span> <span class="nb">set</span><span class="p">(</span><span class="n">vocab</span><span class="p">)</span> <span class="c1"># For faster searching
</span><span class="n">lower_vocab_set</span> <span class="o">=</span> <span class="nb">set</span><span class="p">(</span><span class="n">lowercasedvocab</span><span class="p">)</span> <span class="c1"># Again, faster searching
</span><span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="nb">list</span><span class="p">(</span><span class="n">mapping</span><span class="p">.</span><span class="n">keys</span><span class="p">()):</span>
<span class="k">if</span> <span class="n">word</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">vocab_set</span><span class="p">:</span>
<span class="c1"># Check to see if the word appears as a different case
</span> <span class="k">if</span> <span class="n">word</span><span class="p">.</span><span class="n">lower</span><span class="p">()</span> <span class="ow">in</span> <span class="n">lower_vocab_set</span><span class="p">:</span>
<span class="n">cased_word</span> <span class="o">=</span> <span class="n">vocab</span><span class="p">[</span><span class="n">lowercasedvocab</span><span class="p">.</span><span class="n">index</span><span class="p">(</span><span class="n">word</span><span class="p">.</span><span class="n">lower</span><span class="p">())]</span>
<span class="n">mapping</span><span class="p">[</span><span class="n">cased_word</span><span class="p">]</span> <span class="o">=</span> <span class="n">mapping</span><span class="p">[</span><span class="n">word</span><span class="p">]</span>
<span class="c1"># Remove the word from the mapping
</span> <span class="k">del</span> <span class="n">mapping</span><span class="p">[</span><span class="n">word</span><span class="p">]</span>
</code></pre></div></div>
<p>This process performs operations such as the following (✍ for rename, ✖ for remove):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>✍ papua → Papua
✖ couple_with_heart_woman_man
✖ funeral_urn
✖ man_elf
✍ kazakhstan → Kazakhstan
✖ drooling_face
✖ ideograph
✖ climbing_woman
✖ medal_military
✖ blonde_man
✖ straight face
✍ norfolk_island → NORFOLK_ISLAND
✖ rowing_woman
✖ oncoming_police_car
✖ highfive
✍ full_moon → Full_Moon
</code></pre></div></div>
<p>Using <code class="language-plaintext highlighter-rouge">set</code> here makes a huge difference: at least a 5x gain in speed. It’s a lot faster to search a hash table than to iterate through a 3 million word array.</p>
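<p>The same idea can be pushed one step further. The loop above still calls <code class="language-plaintext highlighter-rouge">lowercasedvocab.index</code>, which scans the list for every renamed word. A minimal sketch, assuming a plain dict from lowercased word to its first index is acceptable (the helper name <code class="language-plaintext highlighter-rouge">recase_mapping</code> and the toy data are made up for illustration):</p>

```python
def recase_mapping(mapping, vocab):
    """Rename keys to the casing found in vocab; drop keys with no match."""
    vocab_set = set(vocab)
    # Lowercased word -> first index in vocab (same result as list.index).
    lower_index = {}
    for i, word in enumerate(vocab):
        lower_index.setdefault(word.lower(), i)

    for word in list(mapping.keys()):
        if word in vocab_set:
            continue  # Already spelled exactly as in the vocab
        i = lower_index.get(word.lower())
        if i is not None:
            mapping[vocab[i]] = mapping[word]  # Rename to the vocab's casing
        del mapping[word]  # Remove the old key either way
    return mapping


# Hypothetical toy data, not the real emojilib mapping
toy_vocab = ["Papua", "happy", "Full_Moon"]
toy_mapping = {"papua": [":flag-pg:"], "happy": [":smile:"], "man_elf": [":elf:"]}
result = recase_mapping(toy_mapping, toy_vocab)
print(result)
```

<p>Building the dict costs one pass over the vocab, after which every lookup is a hash table hit rather than a list scan.</p>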
<h2 id="filtering-the-word-vectors">Filtering the word vectors</h2>
<p>Google’s dataset is HUGE. It takes a ton of memory and a ton of time to load it, and in reality we only care about emoji. If words aren’t similar to emoji names, we don’t need them.</p>
<p>The code below uses <code class="language-plaintext highlighter-rouge">np.inner</code>, which is a nice way of computing dot products between the 3mil x 300 word2vec vocab and 2854 x 300 emoji vocab and giving a 3mil x 2854 array of the dot product results. You may see from this that we could transpose the emoji vocab array so we can do a matrix multiplication (<code class="language-plaintext highlighter-rouge">@</code> in numpy) between the new 3mil x 300 and 300 x 2854 arrays, but <code class="language-plaintext highlighter-rouge">np.inner</code> is a little simpler to call in my opinion.</p>
<p>Also, computing this 3mil x 2854 array in one go would take a <em>lot</em> of memory. So we do it in chunks and write these chunks to a <code class="language-plaintext highlighter-rouge">memmap</code>. Using <code class="language-plaintext highlighter-rouge">amax</code> to find the maximum dot product for each word (i.e. find the closest each word is to our emoji vocab) also decreases the memory needed.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">vocab_norms</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">norms</span><span class="p">[</span><span class="n">vocab</span><span class="p">.</span><span class="n">index</span><span class="p">(</span><span class="n">word</span><span class="p">)]</span> <span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">mapping</span><span class="p">])</span>
<span class="n">dp</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">memmap</span><span class="p">(</span><span class="s">'dpmap.dat'</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">norms</span><span class="p">.</span><span class="n">dtype</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="s">'w+'</span><span class="p">,</span> <span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="n">norms</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],))</span>
<span class="n">CHUNKSIZE</span> <span class="o">=</span> <span class="mi">1000</span>
<span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">norms</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">CHUNKSIZE</span><span class="p">):</span>
    <span class="n">e</span> <span class="o">=</span> <span class="n">s</span><span class="o">+</span><span class="n">CHUNKSIZE</span>
    <span class="n">dp</span><span class="p">[</span><span class="n">s</span><span class="p">:</span><span class="n">e</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">amax</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">inner</span><span class="p">(</span><span class="n">norms</span><span class="p">[</span><span class="n">s</span><span class="p">:</span><span class="n">e</span><span class="p">],</span> <span class="n">vocab_norms</span><span class="p">),</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</code></pre></div></div>
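<p>To check the two claims above on something small, here is a sketch with random toy arrays standing in for the real data (the shapes and the chunk size are arbitrary). It compares <code class="language-plaintext highlighter-rouge">np.inner</code> against the transposed matrix multiplication, and the chunked maximum against the maximum over the full array:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
words = rng.standard_normal((10, 4)).astype(np.float32)  # stand-in for the 3mil x 300 array
emoji = rng.standard_normal((3, 4)).astype(np.float32)   # stand-in for the 2854 x 300 array

# np.inner contracts over the last axis of both arrays, i.e. words @ emoji.T
full = np.inner(words, emoji)
same_as_matmul = np.allclose(full, words @ emoji.T)

# Chunked row-wise maximum, following the same pattern as the loop above
CHUNKSIZE = 4
best = np.empty(words.shape[0], dtype=words.dtype)
for s in range(0, words.shape[0], CHUNKSIZE):
    e = s + CHUNKSIZE  # Slicing past the end of an array is safe in NumPy
    best[s:e] = np.amax(np.inner(words[s:e], emoji), axis=1)

chunked_matches_full = np.allclose(best, full.max(axis=1))
print(full.shape, same_as_matmul, chunked_matches_full)
```

<p>Only one chunk of the big product exists at a time, so peak memory scales with <code class="language-plaintext highlighter-rouge">CHUNKSIZE</code> rather than with the vocab size.</p>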
<p>Now we can find things like the words least similar to our emoji vocab:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Word Similarity
------------------------- -----------
HuMax_IL8_TM 0.045590762
J.Gordon_##-### 0.05180111
By_DOUG_HAIDET 0.056708228
G.Biffle_###-### 0.068331175
K.Kahne_###-### 0.08244177
HuMax_TAC_TM 0.08358895
mso_para_margin_0in 0.08385273
Nasdaq_NASDAQ_TRIN 0.08743682
BY_ROBERTO_ACOSTA 0.08953415
Globalization_KEY_FACTORS 0.09093615
</code></pre></div></div>
<p>At least it seems Google <code class="language-plaintext highlighter-rouge">#</code>ed out personal information. That’s nice. I have no idea why people’s names ended up in the dataset in the first place though.</p>
<p>And things like the most similar words:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Word Similarity
--------------- ----------
Senegal 1.0000011
bearded 1.000001
drink 1.000001
urn 1.000001
industry 1.0000008
fly 1.0000008
training 1.0000008
organizing 1.0000008
Macao 1.0000007
dark_sunglasses 1.0000007
</code></pre></div></div>
<p>These dot products should never be greater than one. I’ll blame it on floating point imprecision.</p>
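<p>A quick way to see this rounding effect, sketched with random float32 vectors rather than the real word2vec data: normalize each vector, take its dot product with itself, and look at how far the result drifts from 1. Clipping is shown as one possible remedy if a later step (such as <code class="language-plaintext highlighter-rouge">np.arccos</code>) needs values strictly within [-1, 1].</p>

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.standard_normal((1000, 300)).astype(np.float32)
# Normalize each row to unit length in float32, like the stored word vectors
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

# Dot product of each unit vector with itself: mathematically exactly 1
self_dots = np.einsum('ij,ij->i', unit, unit)
max_error = float(np.abs(self_dots - 1).max())
print(max_error)  # Tiny: at most around float32 rounding error

# If a later step needs values in [-1, 1], clip first
clipped = np.clip(self_dots, -1.0, 1.0)
```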
<p>I’ll choose an arbitrary cutoff (let’s say <code class="language-plaintext highlighter-rouge">0.5</code>), and keep only the words with a similarity greater than <code class="language-plaintext highlighter-rouge">0.5</code>:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">good_indices</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">where</span><span class="p">(</span><span class="n">dp</span> <span class="o">></span> <span class="mf">0.5</span><span class="p">)</span>
<span class="n">good_vocab</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">vocab</span><span class="p">)[</span><span class="n">good_indices</span><span class="p">]</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'good_vocab.txt'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="s">'utf-8'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">good_vocab</span><span class="p">:</span>
        <span class="n">f</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">word</span> <span class="o">+</span> <span class="s">' '</span><span class="p">)</span>
<span class="n">good_norms</span> <span class="o">=</span> <span class="n">norms</span><span class="p">[</span><span class="n">good_indices</span><span class="p">]</span>
<span class="n">np</span><span class="p">.</span><span class="n">save</span><span class="p">(</span><span class="s">'good_weights.npy'</span><span class="p">,</span> <span class="n">good_norms</span><span class="p">)</span>
</code></pre></div></div>
<p>This yields about a 570 MB file (you can cut that down to 144 MB with a threshold of <code class="language-plaintext highlighter-rouge">0.6</code>). That’s somewhat more manageable! Gzipped it’s about 440 MB, which is still on the large side.</p>
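<p>Reading the two files back in a later session might look like the sketch below. It assumes the vocab file holds whitespace-separated words, as written above, and the helper name <code class="language-plaintext highlighter-rouge">load_filtered</code> is made up for illustration. The demo writes tiny stand-in files to a temporary directory so it runs on its own.</p>

```python
import os
import tempfile

import numpy as np


def load_filtered(vocab_path, weights_path):
    """Return (list of words, 2D array of unit vectors) saved by the filter step."""
    with open(vocab_path, 'r', encoding='utf-8') as f:
        words = f.read().split()  # Splits on any whitespace
    weights = np.load(weights_path)
    if len(words) != weights.shape[0]:
        raise ValueError('vocab and weights have different lengths')
    return words, weights


# Tiny stand-in files so the sketch runs without the real dataset
with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, 'good_vocab.txt')
    weights_path = os.path.join(tmp, 'good_weights.npy')
    with open(vocab_path, 'w', encoding='utf-8') as f:
        for word in ['happy', 'ecstatic', 'dark_sunglasses']:
            f.write(word + ' ')
    np.save(weights_path, np.eye(3, dtype=np.float32))

    good_vocab, good_weights = load_filtered(vocab_path, weights_path)

print(good_vocab, good_weights.shape)
```

<p>Returning the words as a plain list keeps <code class="language-plaintext highlighter-rouge">good_vocab.index(word)</code> available, which a NumPy array of strings would not offer.</p>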
<h2 id="using-the-filtered-vectors">Using the filtered vectors</h2>
<p>Now to use the data:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Reform the list of words and normalized vectors pointing to words in the emoji map
</span><span class="n">lookup</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">word</span><span class="p">:</span> <span class="n">good_weights</span><span class="p">[</span><span class="n">good_vocab</span><span class="p">.</span><span class="n">index</span><span class="p">(</span><span class="n">word</span><span class="p">)]</span>
<span class="n">emoji_words</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">mapping</span><span class="p">.</span><span class="n">keys</span><span class="p">())</span>
<span class="n">emoji_norms</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">lookup</span><span class="p">(</span><span class="n">word</span><span class="p">)</span> <span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">emoji_words</span><span class="p">])</span>
<span class="c1"># Calculate similarities (dot products) between the emoji words and the query word
</span><span class="n">dot</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">dot</span><span class="p">(</span><span class="n">emoji_norms</span><span class="p">,</span> <span class="n">lookup</span><span class="p">(</span><span class="s">'ecstatic'</span><span class="p">))</span>
<span class="n">matches</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">argpartition</span><span class="p">(</span><span class="n">dot</span><span class="p">,</span> <span class="o">-</span><span class="mi">10</span><span class="p">)[</span><span class="o">-</span><span class="mi">10</span><span class="p">:]</span>
<span class="n">sortedmatches</span> <span class="o">=</span> <span class="n">matches</span><span class="p">[</span><span class="n">np</span><span class="p">.</span><span class="n">argsort</span><span class="p">(</span><span class="n">dot</span><span class="p">[</span><span class="n">matches</span><span class="p">])][::</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
<span class="k">for</span> <span class="n">index</span> <span class="ow">in</span> <span class="n">sortedmatches</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="n">dot</span><span class="p">[</span><span class="n">index</span><span class="p">],</span> <span class="n">emoji_words</span><span class="p">[</span><span class="n">index</span><span class="p">],</span> <span class="s">' '</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">mapping</span><span class="p">[</span><span class="n">emoji_words</span><span class="p">[</span><span class="n">index</span><span class="p">]]))</span>
</code></pre></div></div>
<p>This gives:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0.6626913 happy 😄 😀 😁 😃 😊 😹 😆 😋 🌈 😂 😉 😺 😅
0.57641125 disappointed 😥 🙁 😞 😞
0.5699152 surprised 😲
0.5558953 glad 😆
0.55013615 shocked 🤯
0.543222 astonished 😲 😲
0.54264677 proud 😤
0.53669024 stunned 😧
0.5175118 awesome ✨ ❇️ 🌟 👍
0.51358485 relieved 😥 😌 😌
</code></pre></div></div>
<p>Not bad! I’ve never heard of <code class="language-plaintext highlighter-rouge">disappointed</code> being a synonym for <code class="language-plaintext highlighter-rouge">ecstatic</code>, but everything else makes sense.</p>
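<p>The top-10 lookup can be wrapped into a small reusable function. This is only a sketch: <code class="language-plaintext highlighter-rouge">top_matches</code> is a hypothetical helper, and the two-dimensional toy vectors stand in for the real 300-dimensional ones.</p>

```python
import numpy as np


def top_matches(query_vec, emoji_norms, emoji_words, k=10):
    """Return up to k (similarity, word) pairs, most similar first."""
    k = min(k, len(emoji_words))
    if k <= 0:
        return []
    dot = emoji_norms @ query_vec
    # argpartition picks out the k largest values without sorting everything
    candidates = np.argpartition(dot, -k)[-k:]
    # Sort just those k candidates, largest similarity first
    ordered = candidates[np.argsort(dot[candidates])[::-1]]
    return [(float(dot[i]), emoji_words[i]) for i in ordered]


# Toy unit vectors, made up for illustration
toy_words = ['happy', 'sad', 'glad']
toy_norms = np.array([[1.0, 0.0], [-1.0, 0.0], [0.8, 0.6]], dtype=np.float32)
query = np.array([0.6, 0.8], dtype=np.float32)
results = top_matches(query, toy_norms, toy_words, k=2)
print(results)
```

<p>Partitioning first and sorting only the survivors avoids a full sort of every similarity score, which matters more as the emoji vocab grows.</p>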
<p>In part 2, I’ll move to JavaScript so we can get a nicer emoji search in the browser.</p>rianadon
First!2018-06-16T13:12:00+00:002018-06-16T13:12:00+00:00https://rianadon.github.io/blog/jekyll/udate/2018/06/16/first-post<h1 id="61618">6/16/18</h1>
<p>First post! Yay! I’m starting a blog! Yippee!</p>
<p>Now that that’s all out I can get onto writing. First actual post coming soon!</p>rianadon