For associated code, please see the GitHub repo. A huge shoutout to my old friend Daniel Tse, linguist and ML expert extraordinaire, for invaluable help and ideas on both fronts throughout this campaign.
In this article, we build a text-to-image AI that learns Chinese characters in the same way humans do - by understanding what their components mean. It can then invent new characters based on the meaning of an English prompt.
For associated code, please see the Jupyter notebook in the GitHub repository.
While machine learning has been very successful in image-processing applications, there’s still a place for traditional computer vision techniques. One of my hobbies is making timelapse videos by taking photos every 10 minutes for many days, weeks, or even months. Over this time scale, environmental effects such as thermal expansion from the day-night cycle can introduce periodic offsets into the footage.
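As a hedged illustration (not the actual pipeline from this post), classical image registration is enough to measure and cancel that kind of drift: OpenCV's phase correlation estimates the translational offset of each frame against a reference, and a simple affine warp shifts it back. The function and variable names below are my own.

```python
# A hedged sketch (not the code from this post) of cancelling translational
# drift with classical computer vision. cv2.phaseCorrelate estimates the
# sub-pixel (dx, dy) shift between a reference frame and the current frame,
# and cv2.warpAffine translates the frame back.
import cv2
import numpy as np

def align_to_reference(reference_bgr, frame_bgr):
    """Estimate the drift of frame_bgr relative to reference_bgr and undo it."""
    ref = np.float32(cv2.cvtColor(reference_bgr, cv2.COLOR_BGR2GRAY))
    cur = np.float32(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY))
    (dx, dy), _response = cv2.phaseCorrelate(ref, cur)
    # Shift by the negative of the measured offset; the sign convention may
    # need flipping depending on how reference/current frames are ordered.
    h, w = frame_bgr.shape[:2]
    shift = np.float32([[1, 0, -dx], [0, 1, -dy]])
    return cv2.warpAffine(frame_bgr, shift, (w, h))
```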
Generative text-to-image models have recently become very popular. Having a bunch of fully captioned images left over from the This JACS Does Not Exist project, I’ve trained a Stable Diffusion checkpoint on that dataset (~60K JACS table-of-contents images with matched paper titles). It seems to work pretty well. Here are some examples of prompts and the resulting images:
“Development of a Highly Efficient and Selective Catalytic Enantioselective Hydrogenation for Organic Synthesis”

“Lead-free Cs2AgBiBr6 Perovskite Solar Cells with High Efficiency and Stability”

“A Triazine-Based Covalent Organic Framework for High-Efficiency CO2 Capture”

“The Design and Synthesis of a New Family of Small Molecule Inhibitors Targeting the BCL-2 Protein”

Running the model

The fun of generative models is, of course, in running them yourself.
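As a rough sketch of what running such a checkpoint might look like with the Hugging Face diffusers library (the model path below is a placeholder, not the real checkpoint name, and the sampler settings are generic defaults):

```python
# Hedged sketch: generate a ToC-style image from a paper-title prompt with a
# fine-tuned Stable Diffusion checkpoint. "path/to/tjdne-checkpoint" is a
# placeholder for the actual fine-tuned weights.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/tjdne-checkpoint",  # placeholder path, not the real model id
    torch_dtype=torch.float16,
).to("cuda")

prompt = ("Development of a Highly Efficient and Selective Catalytic "
          "Enantioselective Hydrogenation for Organic Synthesis")
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("toc_image.png")
```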
For the language-to-language components of This JACS Does Not Exist, I chose Google’s T5 (Text-To-Text Transfer Transformer), a recent cutting-edge sequence-to-sequence text model.
I had already scraped all the JACS titles and abstracts, so training data was readily available. The first task was to generate somewhat-convincing abstracts from titles to increase the entertainment value of TJDNE.
As abstracts have a maximum length, I wanted to make sure that each whole abstract would fit within T5’s 512-token limit, so that the model could learn where the end of a sequence occurs.
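A minimal sketch of how that preparation might look with Hugging Face transformers, assuming a "title2abstract:" task prefix and assuming that over-length abstracts are simply dropped (both details are my assumptions, not confirmed in the post):

```python
# Hedged sketch of preparing title -> abstract pairs for T5 fine-tuning.
# Examples whose abstract would not fit in 512 tokens are dropped, so every
# training target keeps its end-of-sequence token.
from transformers import T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
MAX_LEN = 512

def encode_pair(title, abstract):
    """Return model inputs/labels, or None if the abstract would be truncated."""
    if len(tokenizer(abstract)["input_ids"]) > MAX_LEN:
        return None  # too long: the end-of-sequence token would be cut off
    inputs = tokenizer("title2abstract: " + title,  # assumed task prefix
                       max_length=MAX_LEN, truncation=True, padding="max_length")
    labels = tokenizer(abstract,
                       max_length=MAX_LEN, truncation=True, padding="max_length")
    inputs["labels"] = labels["input_ids"]
    return inputs
```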
Many people enjoyed title2abstract from the This JACS Does Not Exist project, so I inverted the training parameters for a quick follow-up. Presenting abstract2title. You can also try the title2abstract and toc2title apps in my previous post.
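"Inverting the training parameters" here just means reusing the same title/abstract pairs with source and target swapped. A hypothetical sketch, with assumed field names for the scraped records:

```python
# Hypothetical sketch: the same scraped records feed either direction;
# only the source/target roles (and the task prefix) are swapped.
def make_examples(records, direction="title2abstract"):
    for rec in records:  # rec is assumed to look like {"title": ..., "abstract": ...}
        if direction == "title2abstract":
            yield {"source": "title2abstract: " + rec["title"], "target": rec["abstract"]}
        else:  # abstract2title
            yield {"source": "abstract2title: " + rec["abstract"], "target": rec["title"]}
```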
An imaginary abstract generated at thisJACSdoesnotexist.com

In academic chemistry, authors submit a promotional table-of-contents (ToC) image when publishing research papers in most journals. These fascinate me, as they are one of the few places where unfettered self-expression is tolerated, if not condoned. (See e.g. TOC ROFL, of which I am a multiple inductee.)
In general, though, ToC images follow a fairly consistent visual language, with distinct conventions in different subfields.