project: sonasky

Note: This project has been updated since the original publish date of 2024-08-17. See Updates for details.

about

SonaSky is a Bluesky Labeler providing fursona species account labels for furry users on the platform.

[Image: a screenshot of one of Astra's Bluesky posts showing several labels applied to their account. The first two are from another labeler for pronouns, but the third says "Rabbit", denoting Astra's fursona species.]

The first two labels are pronouns, from @adorable.mom’s pronoun labeler. The third is from SonaSky (what a surprise, I labeled myself “Rabbit”! Who could have guessed?!)

how it works

On the backend, there are a few components working together to run this labeler. At the core is Bluesky’s Ozone, a “web interface for labeling content in atproto / Bluesky”. I used this to convert the sonasky Bluesky account into a labeler service and announce it to the network. This tool is great for manually reviewing and adding labels, but I didn’t want to have to assign labels by hand for everyone who wanted to use this.

That’s why, next, I wrote a TypeScript bot (running in a Docker container) that subscribes to the Bluesky firehose and watches for any like/unlike events. If the event is a “like” event, a filter checks whether the post being liked was made by the sonasky account and whether the post text starts with “Species: ”. It’ll make sense why in a moment. The unlike events do not seem to contain details on the account that made the post that’s being unliked, but they do contain the unliking user’s DID plus a unique ID that matches an ID in the “like” event. So when the user performs the initial like, a record is added to a Postgres database with the unique ID and some other metadata like the timestamp, the user DID, and which species label was picked.
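Here is a minimal sketch of that filter and the like/unlike bookkeeping. The event shapes, the placeholder `SONASKY_DID`, and the in-memory `Map` (standing in for the Postgres table) are all assumptions for illustration, not the bot's actual code.

```typescript
// Hypothetical, simplified event shapes; real firehose payloads carry more fields.
interface LikeEvent {
  likerDid: string;      // DID of the user who liked
  rkey: string;          // unique ID of the like record
  postAuthorDid: string; // DID of the account that made the liked post
  postText: string;      // text of the liked post
}

interface UnlikeEvent {
  likerDid: string; // DID of the user who unliked
  rkey: string;     // same unique ID as in the original like
}

interface LikeRecord {
  userDid: string;
  species: string;
  likedAt: Date;
}

const SONASKY_DID = "did:plc:example-sonasky"; // placeholder, not the real DID
const PREFIX = "Species: ";

// In-memory stand-in for the Postgres table, keyed by liker DID + like ID.
const likes = new Map<string, LikeRecord>();

function keyFor(did: string, rkey: string): string {
  return `${did}/${rkey}`;
}

// Returns the species to label the liker with, or null if the like is irrelevant.
function handleLike(ev: LikeEvent): string | null {
  if (ev.postAuthorDid !== SONASKY_DID) return null;
  if (!ev.postText.startsWith(PREFIX)) return null;
  // Only the first line holds the species; later lines are optional comments.
  const newline = ev.postText.indexOf("\n");
  const firstLine = newline === -1 ? ev.postText : ev.postText.slice(0, newline);
  const species = firstLine.slice(PREFIX.length).trim();
  likes.set(keyFor(ev.likerDid, ev.rkey), {
    userDid: ev.likerDid,
    species,
    likedAt: new Date(),
  });
  return species;
}

// Returns the species label to remove, or null if the like was never recorded.
function handleUnlike(ev: UnlikeEvent): string | null {
  const key = keyFor(ev.likerDid, ev.rkey);
  const record = likes.get(key);
  if (!record) return null;
  likes.delete(key);
  return record.species;
}
```

Because the unlike event only carries the DID and the like's ID, the stored record is the only way to know which label to take off again.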

So, why “Species: ” as a prefix? I wanted maintaining this and adding new labels to be as easy and low-effort for myself as possible, because I knew that if folks did decide to use this service, I wouldn’t have created all the labels I’d need at the start. So I can easily define a new label by creating a post as the sonasky account in the format:

Species: <species name but no hyphens>
// Optional comments for making searching posts easier.

Once the post is created, all I have to do is like and then unlike it to create the label. The TypeScript bot listening for likes will pick it up, see that the label doesn’t exist yet on the labeler, and create a new label with a default display name and description. The display name is the species name as entered, with capitalization matching the Bluesky post, and the description is “This user’s fursona is a(n) …”. (The “n” is added dynamically if the species starts with a vowel. Fancy~) Users can like the post that contains a species label they want, and it’ll be applied almost immediately. They can then unlike it to remove that label.
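The a/an choice can be sketched in a few lines. The function name and the trailing period are assumptions on my part; only the “This user’s fursona is a(n) …” wording comes from the bot as described.

```typescript
// Builds a default label description for a species name as typed in the post.
// Hypothetical helper; checks only the first letter, so it is a rough heuristic.
function defaultDescription(species: string): string {
  const name = species.trim();
  const article = /^[aeiou]/i.test(name) ? "an" : "a";
  return `This user's fursona is ${article} ${name}.`;
}
```

For example, `defaultDescription("Otter")` uses “an”, while `defaultDescription("Rabbit")` uses “a”. A first-letter check will get words like “unicorn” wrong, which is easy enough to patch by hand afterwards.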

I specifically ask for no hyphens in the post because Bluesky label IDs cannot include spaces (or underscores, it turns out, based on my experience/debugging), so I find/replace all spaces in the species name with hyphens to build the ID.
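A minimal sketch of that find/replace, assuming the ID is also lowercased (my assumption; the post only mentions swapping spaces for hyphens). `toLabelId` is a hypothetical name.

```typescript
// Turns a species name into a label ID by replacing runs of whitespace with hyphens.
// Lowercasing is an assumption, not something the original bot is known to do.
function toLabelId(species: string): string {
  return species.trim().toLowerCase().replace(/\s+/g, "-");
}
```

This also shows why hyphens in the post are awkward: `toLabelId("red-tailed hawk")` and `toLabelId("red tailed hawk")` produce the same ID, so the ID alone can no longer tell you which characters were originally spaces.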

I admit, there are probably easier ways I could have done this than forbidding hyphens. I suppose I could have just removed the spaces for the label ID, and then hyphens wouldn’t interfere. Generally speaking, I felt okay with this workaround because… well, for one, I didn’t think about just removing all spaces entirely for the ID field (that would have been smart), but secondly because not every species has a hyphen in its common name. Things are fairly flexible when it comes to including a hyphen in a common name, like “red-tailed hawk” or “blue-ringed octopus”; the hyphens are there for readability rather than because of a strict rule. (If we’re talking scientific names of species, then we get really strict; the International Code of Zoological Nomenclature (ICZN) has some very specific rules about when hyphens can be used in scientific names, and in most cases they can’t be.)

Anyway, if a species’s common name is written with a hyphen, I’ll typically just go into the Ozone UI, pull up the label JSON data, and manually add it back in myself.

using the labeler and making that easier

I don’t think I mentioned yet that I did most of this work between 9pm on a Wednesday through 2am on a Thursday (I think; it was late, time is blurry). I made some labels for some common/popular species based on FurScience’s most popular species graph, thinking “this’ll do for now” and promptly falling asleep.

I woke up the next day to find… like 5 people maybe, max, including myself, used it. Not quite the “this blew up” story you might have expected, based on how I built that up. So I went to work. Then I got home, and that’s when I noticed all my notifications! Lots of requests for new species. It was really exciting to see so much enthusiasm right at the start of this project.

I spent the evening catching up on every request I saw in the replies to the announcement post, and very quickly there were a lot more labels. So many, in fact, that it was actually kind of hard to find existing ones you knew you wanted. At first, I had made a Bluesky feed (using Skyfeed) containing just posts starting with “Species: ” and sorted by popularity, but this too was difficult to browse/search.

So, I quickly threw together a couple of new components for this contraption. First, any time a new label is added by the TypeScript bot, it dumps the entire label service out to JSON and stores it on disk. Then, a very-quickly-thrown-together React web app (React because I knew I could make something SUPER fast and simple that also looked nice using Material UI) displays that data in a table (mui-datatables, to be precise) with a search bar.

I tweaked this for a bit, eventually adjusted the theme to be in “dark mode”, and eventually added some parsing in for the descriptions so I could add metadata-like fields to the table without having to actually modify the JSON schema that Ozone requires. For example, " ... [Category: Mustelid] ..." in the description will add a key to the array of objects feeding the datatable, like {... category: "Mustelid"}.
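The description parsing could look roughly like this. The regex, the function name, and the choice to strip the bracketed tags out of the displayed description are my assumptions; only the `[Category: Mustelid]` to `category: "Mustelid"` mapping comes from the example above.

```typescript
// Matches tags of the form "[Key: Value]" anywhere in a description.
const TAG = /\[([^:\]]+):\s*([^\]]*)\]/g;

// Pulls "[Key: Value]" tags out of a label description and returns them as
// lowercase keys alongside the remaining description text.
function parseDescription(raw: string): Record<string, string> {
  const fields: Record<string, string> = {};
  const stripped = raw.replace(TAG, (_whole: string, key: string, value: string) => {
    fields[key.trim().toLowerCase()] = value.trim();
    return ""; // drop the tag from the visible description
  });
  return { ...fields, description: stripped.replace(/\s+/g, " ").trim() };
}
```

Keeping the metadata inside the description string means the JSON that Ozone expects never has to change; the browse site just derives the extra table columns when it loads the data.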

To browse sonasky labels, see https://sonasky-browse.bunnys.ky/.

conclusion

This project was a really fun time to make, and a great learning experience. I ended up having to learn quite a bit about the AT Protocol and how to interact with it, and I appreciated getting to think through how to design this so it was easy to use and easy to maintain.

Now, about a week after creation (at time of writing), more than 1,000 users have sona labels on their accounts! I am so happy to see folks using this, and even more happy to see all the different (almost entirely positive) responses and posts about this tool. Especially over the first few days, every time I saw someone like a species post or request a new one, I’d check out their account, and a lot of people had posted some really nice comments about the labeler and what it meant to them. I’m most thankful for this opportunity to help continue fostering the furry community presence on Bluesky. (Maybe that sounds a bit grandiose, but hey, just let me have this. I’m proud of a thing I did!)

Updates

Update 2024-08-21

Made some updates to the code to make the bot a bit more resilient to network issues.

For this update, I switched the library I was using to listen to the Bluesky firehose from atproto-firehose to @skyware/firehose (shoutout to @adorable.mom for using it in their pronouns labeler, which I saw they had the code up for!).

The skyware library has a nice feature to resume the firehose from a given checkpoint; @adorable.mom had a nice implementation that stores a checkpoint of the cursor value every minute. I updated sonasky to store the cursor every 60 seconds and resume from the most recent checkpoint on startup. Then, on any error reading the firehose, the program just crashes/exits, and Docker restarts it so it picks up where it left off, rather than, oh I dunno, sitting idle for 12 hours and missing a ton of label assignments. Who would let that happen? Surely not me… *sweats*
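The checkpoint/resume idea, stripped of any real firehose library, might be sketched like this. `CursorStore`, `createMemoryStore`, and `createCheckpointer` are hypothetical names, and the in-memory store stands in for whatever persistent storage the bot actually uses.

```typescript
// Where the cursor gets persisted; a real bot would back this with disk or a database.
interface CursorStore {
  load(): number | undefined;
  save(cursor: number): void;
}

// In-memory stand-in so the sketch runs without any I/O.
function createMemoryStore(initial?: number): CursorStore {
  let saved = initial;
  return {
    load: () => saved,
    save: (cursor: number) => {
      saved = cursor;
    },
  };
}

// Tracks the most recently seen cursor and persists it only when asked to.
function createCheckpointer(store: CursorStore) {
  let latest = store.load();
  return {
    // Cursor to hand to the firehose client on startup (undefined = start live).
    resumeFrom: (): number | undefined => store.load(),
    // Call for every event received.
    observe: (cursor: number): void => {
      latest = cursor;
    },
    // Call on a timer to persist the latest cursor.
    checkpoint: (): void => {
      if (latest !== undefined) store.save(latest);
    },
  };
}

// Wiring (sketch only, not run here):
//   setInterval(() => checkpointer.checkpoint(), 60_000); // save every 60 seconds
//   on a firehose error: exit the process and let Docker restart the container
```

Saving on a timer rather than on every event means a restart may replay up to a minute of events, which is the cheaper failure mode compared with silently missing them.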

Anyway, code cleanup is going alright. Need to do a little bit more tidying, and I hope to publish what I’ve got soon (much in the spirit of @adorable.mom’s pronouns one, because that really helped me with the whole losing-connection-to-firehose-issue).

Update 2024-09-05

Bunch of small updates made over the last week or two! The entirety of Brazil joined Bluesky, so things got slow for a bit! To keep things feeling welcome and inclusive, I set up sonasky-labels-localization, a repo to coordinate translating the English labels/descriptions into Brazilian Portuguese (and, hopefully, more languages in the future!) Thank you to the couple of folks who have been helping with translation so far. At the time of writing, about 60% of the labels are translated into pt-BR!

I added some logic to publish the human-readable timestamp of the cursor position to the SonaSky browse site (as well as the relative time, using dayjs). I also added logic to the bot to check the delay every 10 minutes; if it ever slips past a 5 minute delay, the bot updates the display name of the Bluesky profile. (No matter what, the profile description also gets updated with the cursor position/delay details.)
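The delay check itself is simple arithmetic on timestamps. In this sketch the function names and the “(delayed)” suffix are hypothetical; the post only says the display name gets updated once the delay passes 5 minutes.

```typescript
const MAX_DELAY_MS = 5 * 60 * 1000; // 5 minute threshold from the post

// How far behind "now" the firehose cursor is, in milliseconds (never negative).
function cursorDelayMs(cursorTime: Date, now: Date): number {
  return Math.max(0, now.getTime() - cursorTime.getTime());
}

// True once the bot has slipped past the allowed delay.
function isLagging(cursorTime: Date, now: Date): boolean {
  return cursorDelayMs(cursorTime, now) > MAX_DELAY_MS;
}

// Hypothetical display-name rule: append a marker while the bot is behind.
function profileDisplayName(base: string, cursorTime: Date, now: Date): string {
  return isLagging(cursorTime, now) ? `${base} (delayed)` : base;
}
```

In the bot this would run on a 10 minute timer, with the result pushed to the profile through whatever client the bot already uses.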

I also added a table to the database to store the post links so the “Go to Bluesky” links on SonaSky browse can go directly to them rather than using a mostly-functional Bluesky search.

Finally, I tidied up the code to be good enough. There’s some stuff that’s hardcoded, like the profile description, but whatever. It’s up on GitHub now!

Update 2024-09-15

Apparently, there’s a limit to how much data you can send to Bluesky in the label definitions field. I get “Error: request entity too large” whenever I try to push new/updated label definitions, which is a problem.

I’ve opened a bug with Bluesky to see if anyone can look into it:

Ozone/Labeler Update Fails with “Request entity too large” when adding large number of labels/localizations #2803

Update 2024-10-25

After hitting that maximum data size limit a few more times, I split SonaSky into two labelers, since the biggest category of labels by far was Pokémon. The same bot controls both labelers, and the definition file for the labels denotes which labeler serves each label.
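One way to picture the definition file driving both labelers is a per-label `labeler` field that the bot groups on. The field name, the labeler names, and the function below are all hypothetical; the real file format may look quite different.

```typescript
// Hypothetical shape of one entry in the label definition file.
interface LabelDefinition {
  id: string;      // label ID, e.g. "red-panda"
  labeler: string; // which labeler account serves this label
}

// Groups label IDs by the labeler that should serve them, so each labeler
// only receives its own (smaller) set of definitions.
function groupByLabeler(defs: LabelDefinition[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const def of defs) {
    const ids = groups.get(def.labeler) ?? [];
    ids.push(def.id);
    groups.set(def.labeler, ids);
  }
  return groups;
}
```

Splitting this way keeps each labeler's definition payload under the size limit while the bot still reads a single file.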

[Diagram: SonaSky "v2", showing two labelers controlled by the same bot.]

read more

Bluesky’s Stackable Approach to Moderation (March 12, 2024, by The Bluesky Team)

GitHub: bluesky-social/ozone