Expanding a Tiny(mce) home

For an open source project as old and well-known as TinyMCE, the primary repository is more than just a collection of code; it’s home base, the first place to look for help, and the foundation for everyone who forks or contributes to the project.

So you can imagine that making perhaps the biggest change this repository has ever seen was quite daunting. What started as a simple proposal became an evolving background task, and it has taken five months for me to become confident that it is stable.

As TinyMCE transitioned to a modern codebase through 2017 and 2018, many external library dependencies were added from previously closed source projects. This multi-repository development style worked well enough in small teams, but as we grew the team on TinyMCE 5 we hit the pain points harder and harder, slowing the team down significantly. It’s a common story in “I switched to a monorepo” blog posts.

As TinyMCE 5 wound through beta and release candidate at the end of 2018 I decided enough was enough. In consultation with Spocke the decision was made to bring our 22 library projects together alongside TinyMCE in a monorepo. I volunteered to take this on as a pet project, expecting the scope of changes in TinyMCE 5 to take priority and it would be a long slow burn.

I don’t want to create another tedious “how to monorepo” article, but I do want to give a high-level overview of why and how we did it for posterity and conversation.

Based on my expectation of delays I started with the best decision of the entire project: I did not build the monorepo by hand. I used a script to do the heavy lifting, importing both the existing code and templates for new files. This script could rebuild my monorepo fork from the master branch of TinyMCE and the 22 source repositories in just 2 minutes. This gave me the freedom to progress, experiment and iterate in my own little sandbox while also keeping it up to date.

Early on we decided to convert the TinyMCE repository rather than start a new one. The popularity of the project meant we could not break the master branch; the whole monorepo update had to come down to our developers with a single git pull. Next, we settled on Lerna as the basis for our monorepo. It is fairly well known and seemed to be strongly recommended.

Side note: the decision to use modules instead of the Lerna default packages is a whole other tale, one I tried to cover in the TinyMCE readme. It’s quite possible we will eventually drop Lerna; yarn workspaces give us most of the benefits I was looking for, and there are definitely stories of people outgrowing Lerna. But for now it’s working well.

I quickly abandoned Lerna’s low-level package management functions and settled on Yarn workspaces instead, but Lerna’s help with publishing independently versioned modules is essential, as we wished to continue publishing the libraries even after their source code was merged into the monorepo.

The base setup of my monorepo script was fairly simple:

  • yarn config set workspaces-experimental true
  • mkdir -p modules/tinymce && mv .gitignore * modules/tinymce
  • create new .gitignore
  • lerna init
  • Update lerna.json
    • "version": "independent",
      "npmClient": "yarn",
      "useWorkspaces": true
  • Update the default package.json created by lerna:
    • "workspaces": ["modules/*"]
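
With those steps done, the root of the sandbox contains roughly these two files (a simplified sketch; the root package name is illustrative and the real files carry a few extra TinyMCE-specific fields and scripts):

    // lerna.json
    {
      "version": "independent",
      "npmClient": "yarn",
      "useWorkspaces": true
    }

    // package.json at the repository root
    {
      "name": "tinymce-monorepo",
      "private": true,
      "workspaces": ["modules/*"]
    }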

 

From there I needed to start adding the package repositories into the monorepo. There are many and varied resources to help with creating a monorepo, but the conversion I had planned was more complicated than most of the articles I found on the subject covered.

My initial attempt followed the simplest possible approach, git subtree:
git subtree add -P modules/name ../name master

This retained the commit log history, but it didn’t show file diffs in each commit (I guess due to the file location changes?). I’m not sure what the use case for this command is.

My second attempt was with Lerna’s import command. This filtered every commit, making it very, very slow (8867 commits across 22 repositories), and the resulting git history structure did not impress me.

After digging through more articles, and finding the right words to search for, I came across a detailed approach to merging git repositories without losing file history by Eric Lee (via his Stack Overflow post).

This technique checks out the source repository master, moves all files into a subfolder, then merges that into the monorepo master. SeemsGood. I only needed to make small adjustments to this process; I can post the script if there are requests, but most of the details are specific to TinyMCE and our internal git server.
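
At its core the process looks something like this, using katamari as the example module (a sketch; my real script loops over all 22 repositories and handles a few more edge cases):

    # In a clone of the library repository, move every tracked file into its target subfolder
    # (assumes paths without spaces; the real script is more careful)
    cd katamari
    git checkout master
    mkdir -p modules/katamari
    for f in $(git ls-files); do
      mkdir -p "modules/katamari/$(dirname "$f")"
      git mv "$f" "modules/katamari/$f"
    done
    git commit -m "Move katamari into modules/katamari"

    # In the monorepo, merge the relocated library history onto master
    cd ../tinymce
    git remote add katamari ../katamari
    git fetch katamari
    # --allow-unrelated-histories is needed on git 2.9+ because the histories share no ancestor
    git merge --allow-unrelated-histories katamari/master
    git remote remove katamari

Because the moves are ordinary git renames, git log --follow still traces every file back through its pre-monorepo history.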

Once the repositories are imported, my script uses sed and a few other tricks to adjust common tooling so it works inside the monorepo:

  • Move all devDependencies to the monorepo root to avoid diverging versions
  • Switch all packages to TypeScript project references and build mode (see the tsconfig sketch just after this list)
  • Switch all packages to LGPL, matching TinyMCE (most of the source projects were Apache 2.0 licensed)
  • ./node_modules/.bin/cmd and npx cmd no longer work
    • This was a simple fix, we now use yarn cmd
  • load-grunt-tasks found no tasks to load because it required the task modules to be in a local node_modules, but the monorepo moved all of those to the root. So I had to get creative:
    • require('load-grunt-tasks')(grunt, {
        requireResolution: true,
        config: "../../package.json",
        pattern: ['grunt-*', 'rollup']
      });
  • grunt.loadNpmTasks() also found no tasks to load
    • This was deleted, replaced by pattern additions to the load-grunt-tasks config.
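
The project references switch deserves a quick illustration. Each module ends up with a tsconfig along these lines (module names and paths here are illustrative), and a single yarn tsc -b at the root, pointed at a solution tsconfig that references every module, builds the lot in dependency order:

    // modules/katamari/tsconfig.json (sketch)
    {
      "compilerOptions": {
        "composite": true,         // required so other projects can reference this one
        "outDir": "./lib",
        "rootDir": "./src"
      },
      "references": [
        { "path": "../sand" }      // the modules this one depends on
      ]
    }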

From here I made constant tweaks to the import script and began finding issues I could fix in the source repositories instead of patching with my script:

  • Repositories with CRLF in their files (TinyMCE standardises on LF). With the way the script configures git merge, it was performing renames plus line ending conversions but only committing the rename. This did not happen often, so it was easy to fix when it came up.
  • In late 2018 we built a more modern API for our AJAX library, jax, but only deployed it to premium plugins. We decided that rather than standardise the monorepo on the old API, we would completely replace the open source jax with this new code.
  • I took the opportunity to use BFG Repo-Cleaner to strip out binary history from two projects before they were imported. This brought the monorepo git pull size down from 23MB to 11MB.
  • We use webpack a lot in development and have long relied on awesome-typescript-loader. We are big fans of atl, and we still use it for the demo environment, but testing in the monorepo was just too slow without support for TypeScript project references. So we switched testing to ts-loader, which does support them, via a seemingly innocuous commit message 😉 (there’s a sketch of the loader config after this list)
  • The way our test framework imported tests, when applied to the whole monorepo, produced a 19MB JavaScript file and a fun TypeScript error:
    TS2563: The containing function or module body is too large for control flow analysis.
    This one turned out to be easy to fix and sped up all projects, so in it went to the open source project months ago.
  • I sneakily deployed a monorepo-specific change to resource loading in tests. The test framework gained resource aliasing for yarn workspaces, including a local alias that was otherwise pointless; I then switched the TinyMCE tests to use that local alias, allowing all tests to run inside the monorepo without any patching from the script.
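
For the curious, the ts-loader switch mentioned above boils down to a webpack rule along these lines in the test config (simplified; assume the usual entry and output settings around it):

    // webpack rule for the test bundles (sketch)
    module.exports = {
      module: {
        rules: [
          {
            test: /\.ts$/,
            exclude: /node_modules/,
            use: {
              loader: 'ts-loader',
              options: {
                // build upstream modules via their project references
                projectReferences: true
              }
            }
          }
        ]
      }
    };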

I poked and prodded at this on and off for a couple of months until I had my monorepo dev environment working, the tests all passed, and I could start thinking about versioning / publishing. I’m the sort of person who doesn’t totally trust documentation; I want to mess around and explore the commands to see what they do.

This required extreme caution. I had imported live NPM module source code, so a rogue npm publish could have dire consequences. After spending some time logged out of NPM to be safe, I hit upon the publishConfig package.json setting, which Lerna respects and which let me constrain all publishing to a specific NPM registry.
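
In practice that is a publishConfig block in each module’s package.json pointing at the sandbox registry (the repository name and URL below are illustrative):

    {
      "name": "@ephox/katamari",
      "publishConfig": {
        "registry": "http://localhost:8081/repository/npm-private/"
      }
    }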

I used a local Sonatype Nexus Docker container along with a fork of TinyMCE to give me complete freedom to publish builds and explore the publish styles Lerna offers while playing in my sandbox.
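
Spinning up the sandbox registry is a one-liner; the stock Nexus image serves both its UI and the registry endpoints on port 8081:

    docker run -d --name nexus -p 8081:8081 sonatype/nexus3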

We publish to NPM from CI very regularly, so at first glance from-package seemed like the best path forward to match how we had been developing. After some discussion we decided to switch to an automated lerna publish patch. Independent versioning on 22 packages will potentially create hundreds of version tags a year, but we love automation, and cleaning up a tag mess can be scripted. We still use from-package to account for manually updated minor and major versions, but we are hoping to explore conventional commits to later automate the entire release process.
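
In CI the two styles end up looking roughly like this (simplified; the real pipeline wraps them in more checks):

    # automated release: bump and publish a patch for every changed package
    yarn lerna publish patch --yes

    # manual minor/major releases: versions are bumped and committed first, then
    yarn lerna publish from-package --yes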

As a final step I leveraged lerna changed to create a new CI script that only runs tests on changed packages. This reduces CI build times and further improves the iteration speed of developing in the monorepo.
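
The heart of that script is simply asking Lerna which packages differ from their last release tag and running those test suites (a simplified sketch):

    # list changed packages as paths and test only those
    for pkg in $(yarn --silent lerna changed --all --parseable); do
      (cd "$pkg" && yarn test)
    done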

Five months in the making, and after a week of internal testing, nearly 9000 new commits and the TinyMCE monorepo are now live. I’ve had a lot of fun building it, and I hope it works as well for our contributors as it already has for our own development.

A big TinyMCE shakeup

Today marked the release of TinyMCE 5 and the launch of our shakeup of the project, in the hope of modernising it and moving it forward into the future. This shakeup, from my perspective, comes in four parts.

People

The first big change is that version 5 was developed under the stewardship of the Editor Platform team in Brisbane, Australia, where I am the Technical Lead. Our developers in Sweden are still involved – TinyMCE wouldn’t be TinyMCE without them – but the team assigned to TinyMCE 5 is now triple the size we had previously (and it expanded to four times that size at one point during the project). Everyone is excited about the things this makes possible.

FP

The second big change is the switch to a more rigorous Functional Programming style of development, not only in the code but also in how the project is put together. We have many small, self-contained libraries that we now compose together, particularly in the new silver theme.

The team in Brisbane has been working in this way for a number of years while developing our Textbox.io editor technology; moving onto TinyMCE, we have begun to apply those techniques there as well. I could (and should) do an entire post on the details, but if you browse the codebase and see imports from weird-sounding libraries like alloy, sugar and boulder, that’s us.

The libraries we built had been making their way into TinyMCE in bits and pieces over the last year or two, but their use went into full effect with this release.

UI

That brings me to the third big change and headline feature of TinyMCE 5: the new silver theme and oxide skin. The entire src/ui folder has been deleted, along with the modern theme, and we started fresh. With the blessing and help of TinyMCE lead developer Johan “spocke” Sörlin, we built a new API designed to be modern, flexible, and most importantly abstracted away from the DOM. Taking inspiration from virtual DOM style libraries, this structure serves two high-level goals:

  1. Custom dialogs fit into the new TinyMCE style without any effort, including the multitude of UI variants the new skin makes possible.
  2. Implementation details of components can be changed without breaking the API. We had a very clear goal not to replicate the “everything and the kitchen sink” approach of previous TinyMCE interfaces.
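
To give a flavour of that abstraction, a minimal custom dialog in the v5 API looks roughly like this (trimmed down; the full option set is in the v5 documentation):

    tinymce.activeEditor.windowManager.open({
      title: 'Insert link',
      body: {
        type: 'panel',
        items: [
          { type: 'input', name: 'url', label: 'URL' }
        ]
      },
      buttons: [
        { type: 'cancel', text: 'Cancel' },
        { type: 'submit', text: 'Insert', primary: true }
      ],
      onSubmit: (api) => {
        const data = api.getData();
        tinymce.activeEditor.insertContent('<a href="' + data.url + '">' + data.url + '</a>');
        api.close();
      }
    });

Note that there is not a DOM node in sight – the dialog is described as plain data, which is what lets the theme decide how to render it.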

This breaking API change is the one that will be most obvious for developers migrating to TinyMCE 5, and also the most contentious. We have already received feedback during the pre-release cycle that the new API does not replicate v4 functionality in areas that are important to our developers. We hope to address these needs in the near future but we will be doing our best to not compromise either goal.

Project Structure

The final change I’d like to talk about is one that we are working on but isn’t yet ready. In the last few years it has become clear to the Editor Platform team that while our split library approach allowed for excellent separation of concerns, it was frustrating to develop with when changes were required across multiple libraries. This manifested in a significant way during TinyMCE 5 development, is a common story as the number of libraries in a project expands, and the common answer to these problems is a monorepo. It isn’t a simple change, although it appears to be easier than it used to be; I’m keeping notes and plan to do a detailed write-up of the process.

So the shakeup still in progress is that at some point soon the TinyMCE repository will most likely be converted to a monorepo – everything there today will move into a subfolder, and the entire 7000-commit history of the libraries used in TinyMCE 5 will be merged into the master branch under the same subfolder. How that looks and when it happens is yet to be determined, but initial signs are that it will ease these pain points for us.

Takeaways

I hope this change in direction for TinyMCE is well received by the community. We’re trying something new, initial feedback on version 5 has been positive, and I’m confident the changes are for the best. We are listening, however, so if you think we’ve done something wrong or not well enough, feedback is greatly appreciated.

A fun morning on twitter

I’ve had a John Carmack quote at the top of my CV pretty much since I started taking it seriously (which was actually after I got my first job with Ephox, who I still work for).

This morning, I mentioned it to him in response to a speech he gave recently:

And well… the stats on my tweet tell most of the story, but a picture is worth a thousand words:

[Screenshot: John Carmack retweet]

Today is a good day.

Why OCaml, why now?

OCaml first hit my radar in November 2013. I had just learnt SML, a similar but older language, in the excellent Programming Languages Coursera course. Dan Grossman is one of the best lecturers I’ve ever seen; his explanations hit all the right notes and made learning easy. The simplicity of the SML syntax, and the power of the language while still producing code that is readable with minimal training, immediately appealed to me.

Over the last 3 years I have tried, and failed, to learn Haskell. The combination of minimalist syntax, pure functional programming style and lazy evaluation is like a 3-hit sucker punch that is very hard to grasp all at once. Having now learnt SML and OCaml, which along with Haskell trace their roots to the ML family, that has changed. I have yet to put any more effort into learning Haskell, but it is now clear to me that the syntax is only a small leap from ML and the pure functional style has similarities to SML.

I still don’t want to write production code in Haskell, but the fact that I find it less scary than I used to indicates I have made a significant jump in my knowledge and, arguably, career in the last 6 months.

Dynamic typing

Before I go any further, I need fans of dynamic typing to exit the room. My 12 years in the industry have set my camp firmly on the static typing side of the fence, and discussions about static vs dynamic will not be productive or welcome here.

So, why OCaml?

Smarter people than me have written about this, but I’ll give it a shot.

I have found OCaml to be a refreshing change of pace. Most of my favourite things are derived from the ML base language; variants, records, and pattern matching combine to create elegantly powerful code that is still easy to follow (unlike most Haskell code I’ve seen).
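
A tiny example of what I mean, using nothing but those core building blocks:

    (* variants, records and pattern matching working together *)
    type point = { x : float; y : float }

    type shape =
      | Circle of point * float        (* centre and radius *)
      | Rectangle of point * point     (* opposite corners *)

    let area shape =
      match shape with
      | Circle (_, r) -> 3.14159 *. r *. r
      | Rectangle (a, b) -> abs_float ((b.x -. a.x) *. (b.y -. a.y))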

OCaml takes the expression-based ML style and incorporates enough imperative features to make it comfortable for someone learning Functional Programming. Don’t know how to use recursion to solve a problem? Drop into a for loop in the middle of your expression. Need some debug output? Add it right there with a semicolon to sequence expressions.
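
For example, sprinkling a mutable counter and a debug print into an otherwise expression-based function is completely unremarkable OCaml:

    let sum_of_squares n =
      let total = ref 0 in
      for i = 1 to n do
        total := !total + (i * i)
      done;
      (* debug output, sequenced into the expression with a semicolon *)
      Printf.printf "sum_of_squares %d = %d\n" n !total;
      !total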

Throw in almost perfect static type inference, a compiler that gives useful error messages, and immutable-by-default variables, and I just can’t get enough. I won’t sit here and list every feature of the language, but hopefully that piques your interest as much as it did mine 😉

Industry acceptance

There is always an element of “I have a hammer, everything looks like a nail” when learning a new language but the evidence that OCaml is becoming more widely accepted is not hard to find.

In the middle of February, Thomas Leonard’s OCaml: what you gain post made waves; the reddit and hackernews discussions are fascinating. A lot of people using OCaml in the industry came out of the woodwork for that one. I’m still working my way through the series of 11 posts Thomas made, dating back to June 2013, about his process of converting a large Python codebase to OCaml.

Facebook have a fairly extensive OCaml codebase (more details below).

It doesn’t take much googling to find presentations by Skydeck in 2010 (they wrote ocamljs, the first ocaml to JS compiler) or a 2006 talk describing why OCaml is worth learning after Haskell.

OCamlPro appear to be seeing good business out of OCaml, and they have an excellent browser-based OCaml tutorial (developed using, of course, js_of_ocaml).

No list of OCaml developers would be complete without mentioning the immense amount of code at Jane Street.

There are plenty of other success stories.

The elephant in the room

The first question I usually get when I tell a Functional Programming guru that I’m learning OCaml is “Why not Haskell?”. It’s a fair enough question. Haskell can do a ton more than OCaml can, and there are only one or two things OCaml can do that Haskell can’t (I don’t know the details exactly; I would have guessed zero). I see a lot of references to OCaml being a gateway drug for Haskell.

The answer is JavaScript. As much as I hate the language, JS is the only realistic way to write web apps. Among the many and varied AltJS languages, both OCaml and Haskell can be compiled to JavaScript, but the Haskell compilers aren’t mature enough yet (and I’m not convinced lazy evaluation in JavaScript will have good performance).

In fact, a bit of study suggests OCaml may have the most mature AltJS compiler of all, by virtue of its support for existing OCaml libraries.

JavaScript

Late last year I started hearing about OCaml at Facebook. Their pfff tool, which is a serious OCaml codebase all by itself, is already open source – but there was talk of an even larger project using js_of_ocaml (the link seems to be offline, try the video). That presentation by Julien Verlaguet is almost identical to the one he gave at YOW! 2013 and it really grabbed my attention. (Hopefully the YOW! video is online soon, as it’ll be better quality).

To cut a long story short, Facebook created a new language (Hack, a statically typed PHP variant) and wrote the compiler in OCaml. They then use js_of_ocaml to compile their entire type checker into JavaScript, as the basis of a web IDE (@19 minutes in the video) along the lines of cloud9. Due to the use of OCaml for everything, this IDE has client-side code completion and error checking. It’s pretty amazing.

Maturity of tools and js_of_ocaml

The more I dive into OCaml, and specifically js_of_ocaml, the more it amazes me that the tools and documentation reached production quality just as I needed them.

  • The package manager OPAM is now a little over 12 months old and every library I’ve looked at is available on it. Wide community acceptance of a good package manager is a huge plus.

  • The Real World OCaml book was released in November and is an excellent read. The book is so close to the cutting edge they had features added to September’s 4.01.0 compiler release for them 🙂

  • OCaml Labs has been around for 12 months, and they’re helping to move the OCaml community forward into practical applications (see the 2013 summary).

  • Ocsigen are investing heavily in js_of_ocaml (among other things) with the next release including an improved optimiser (I can attest to the fact that it’s awesome) and support for FRP through the React library.
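
To give a sense of how approachable the toolchain has become, going from OCaml source to JavaScript is essentially two commands once js_of_ocaml is installed through OPAM (from memory, so check the js_of_ocaml documentation for your version):

    opam install js_of_ocaml
    ocamlfind ocamlc -package js_of_ocaml -linkpkg -o hello.byte hello.ml
    js_of_ocaml hello.byte     # emits hello.js alongside the bytecode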

Moving forward

Is it perfect? No. Software development is not a one-size-fits-all industry. There are as many articles cursing the limitations of OCaml as there are singing its praises. But in the current market, and with the size of JavaScript applications we are starting to generate, I believe OCaml has a bright future.

Enabling 5.1 surround sound in OS X

Close to 18 months ago, when I first started seriously using that old mac laptop, I decided I needed a way to easily transfer my speakers between the desktop games machine and my mac that I used for everything else. One of my mates at work had an Audigy 2 NX, and after borrowing it for a day to make sure it worked on macs I decided to get one. It wasn’t until I had it that I realised the mac was only giving me 2 channels instead of 5.1 😦

I shrugged and chalked this up to the built-in Mac drivers; it was fine under Windows with the official Creative drivers.

And so it was that when I upgraded to the Mac mini, and again with this second mini, I was stuck with a sound card that wasn’t giving me surround. Most of the time this doesn’t concern me as I usually only listen to stereo sources, but I’d never even considered that it might work (the few references I could find to this device on the net said it only worked in stereo on the Mac).

Until tonight.

While doing some research for a friend who was interested in USB sound cards, I saw a product review stating that the Zalman USB card does work on Macs in full 5.1 surround mode. This piqued my interest, so I went searching and stumbled on a forum post listing working sound cards. Right there at the top is the Zalman card, but hang on, what’s that sitting at the bottom under supported 7.1 cards? Why, it’s my damn Audigy 2 NX! WTF!

I immediately (and stupidly) installed the package attached to that post, but thankfully I read a bit further down the post before rebooting and realised I didn’t need to. This was a good idea, because the package is from the 10.4 era and I would almost certainly have been left trying to do a restore from backup. I’ve reverted the kext files that the package installed; hopefully my Mac doesn’t die when I reboot it after posting this.

In any case, the answer is Audio MIDI Setup! A program that had always sat in the Utilities folder looking summarily useless turns out to be the hidden gem that Apple really needs to make more obvious. For those who will no doubt arrive here from Google one day, here’s how to enable 5.1 surround sound on a USB sound card:

  1. Select your sound card under the Properties For: dropdown
  2. Select the number of channels under the audio output format
  3. Click Configure Speakers
  4. Select Multichannel
  5. Select the correct number of speakers from the dropdown (only the valid one should be enabled)
  6. You can now assign channels to each speaker; I’m pretty sure the numbers I used are correct, although 3/4 and 5/6 might be in the wrong order

Here are a couple of screenshots with number highlights to make it clear:
Audio Midi Setup
Audio Midi Speaker Setup

Maybe it’s just this sound card, but that’s a ridiculous requirement to get 5.1 surround sound working (and I haven’t actually tested if DVDs will play correctly, only some 6 channel test wavs I found). Wish me luck! 😉

On the plus side, if this does work I will no longer have to worry about surround sound output from my media centre when I buy proper home theatre speakers (the audigy has optical and spdif out). I had been concerned that I would be stuck with stereo output from my Mac forever!

This new google sync has a slight flaw

The new Google mobile ActiveSync is working great for my calendars. Syncing iCal to Google was pretty easy; I exported my 3 local calendars, cleared out the main Google calendar & created 2 new ones (naming my primary calendar “work” thanks to the stupid Outlook plugin), subscribed to all 3 via CalDAV with Calaboration and then imported the data. No worries at all. I can create an event in iCal and 10 seconds later it appears on my phone 😀

There was a bit of confusion and duplication after syncing my Outlook calendar at work to Google (Did I mention the plugin’s main-calendar-only restriction is REALLY annoying? How about its complete inability to detect duplicates?) but that was pretty easy to clear up.

What I haven’t done is turn on address book syncing with the phone. As I suspected and others have confirmed, turning on ActiveSync for contacts & calendar stops iTunes from doing any sync work with them for the iPhone. Which, since iTunes initiates the contact sync to Google, means that contacts are no longer synced to my desktop.

Both forum posts I’ve just linked to have suggested fixes (particularly if you expand them beyond the accepted answer), but I can see three options personally to sync my contacts:

  • Resurrect the fancy iSync scheduling that I haven’t used since switching to the iPhone (I still use the scheduler, just for some Address Book hackery instead of activating the iSync menu)
  • Don’t drop Plaxo completely as I had planned, but use it to sync between google’s contacts and Address Book
  • Leave over-the-air contact sync disabled and continue with iTunes to Google contact sync

So far option 3 sounds the easiest to me. I don’t need to sync my contacts more than once a day (which is how often I sync with iTunes), over-the-air sync wouldn’t give me all of my contact numbers on the phone anyway, and this way I can completely disconnect from Plaxo.

Not that Plaxo is bad – I’ve really enjoyed the service, including a far better Outlook calendar sync platform than Google’s Outlook plugin provides – but ever since I switched my email to Google I have only used it for the Outlook sync (hotmail contact sync is enabled but I don’t need it anymore). It just doesn’t make sense to continue using it in light of Google’s improved Mac/iPhone sync options.