Suddenly my MacBook Pro became unresponsive. I was tempted to fling it out the window and curse Apple. But wait, what was actually going on?

This article was actually written just before New Year, so the tone may reflect that; I only managed to add a small bit and publish it now.

A recent craze on my Twitter timeline was triggered by someone’s rather ill-advised tweet (no direct link, in the hope of sparing the poor lad). The author lists CEO as the title in his Twitter profile. He tweeted:

The best software developers I know are always hacking over the holidays.

The twitterati started mocking him, sometimes in creative ways. Hopefully the author has learned his lesson.

Anyway, that aside, I did take a break: I turned off my computer for a few days so that I could dedicate most of my attention to my little family. Then, just today, it happened. I turned on my Mac and it became totally unresponsive. I wasn’t sure what I had done wrong. I only remembered recently updating the CUDA driver and Parallels Desktop, nothing else. A strange dialog popped up, giving me the shudders:

your system has run out of application memory

What was up with this? How come a system that had run smoothly for two years would suddenly behave like this? Puzzling to anyone, except perhaps those versed in the inner workings of macOS; I’m not one of them. At this point it would be reasonable to consider a system reinstall, which would cost a day’s effort.

Too long; didn’t read [TL;DR]

Two processes showed high CPU usage: the Dropbox app for Mac, and one called bird, which seems to stand in for the iCloud daemon. iCloud was also showing memory leaks in the system log. Additionally, Docker for Mac Beta, Parallels Desktop, Parallels Access and Veertu were all installed. All of these install kernel extensions, and something in that mix was apparently causing havoc; I couldn’t tell which one. After a long quest, I resorted to 1) backing up my Mac, then 2) creating a bootable USB stick and reinstalling the OS. Only then was it back to normal operation. Without losing any files, or buying a new computer, I was able to fully recover my system.

Now the whole story

I heaved a deep sigh and started investigating. I used the troubleshooting tips that I knew: checked the disk with Disk Utility (it found no errors), uninstalled CUDA, removed items from the LaunchDaemons and LaunchAgents lists. To no avail; nothing helped.

Boot macOS in Safe Mode

I started googling at this point. I eventually found a suggestion to restart the computer in Safe Mode. I wasn’t sure why that should help, but I tried it nonetheless. I followed the procedure:

  1. Restart the Mac.
  2. Immediately press and hold the Shift key.
  3. It took so long that I actually put something heavy on the key to keep it depressed, and used my iPad while waiting. Later I would find out it wasn’t necessary to hold the key down. I didn’t time it; it could’ve been anything between 3 and 5 minutes, but eventually …
  4. When the login screen showed, I released the Shift key, then restarted the computer.
  5. The problem seemed to be resolved. Or so I thought; I was going to find out soon enough.

Then something dawned on me: Caching screwed up my machine, again!

For a little while I used my computer, then it started to slow down steadily, to a crawl once more. It became totally unusable. It got so bad that something seemed to be turning the camera on without user intervention! I checked and didn’t see Photo Booth or FaceTime running; what the heck was turning the camera on then? Now I really started to worry that some malware had crept in, ruining my system.

Chasing malware (or ghosts)

Now thinking that my computer was malware-infected, I decided to investigate. First, I covered the camera with tape and started doing packet capture and analysis with Wireshark. This is tedious, forensic work; it takes forever and you don’t know upfront whether you’ll get anywhere! I briefly had a go, sampled various connections, scanned and tracked them. I found nothing, and was no wiser about how the camera was being triggered anyway. After about an hour of prodding here and there, I stopped and decided to try something else.

The system became so slow that nothing worked. Activity Monitor was stalling with blank screens. Finder wasn’t responding. Launching a terminal session wasn’t working: I would invoke the system search box, and it would take 2 minutes to show the search input! I would try to type something; one character would show, not the rest. I’d stop typing, wait, and after much delay a couple more characters would show, and then again no response. In short, all the signs of an absolutely hosed computer. My trusted workhorse was letting me down, without a warning, and all the usual troubleshooting tools kept showing me green, a clean bill of health for my computer. What the heck was going on?

Getting angry

At this point my patience ran out and I started to get angry. I had no idea how I’d managed to trip the system up in this way. I wasn’t actually doing anything special or tricky, just running Parallels Desktop and getting my Windows VM updated. Other than that, nothing. I can’t remember such a terrible experience with a Mac since the days before I got rid of Adobe Flash, a few months before the famous Thoughts on Flash letter was published. Simply cursing Apple was tempting, given the perplexing decline in quality at Apple. But I thought better of it; I wasn’t just going to fall for a cargo cult. I had to fix this. It had already cost much time, but I had to find out.

Boot in recovery mode

I rebooted it in recovery mode, intending to restore from a backup. Then I found out, or rather was reminded, that I had explicitly excluded the OS from my backups. Had I resorted to the online restore or other means, I would have gone back to an ancient version, OS X Mavericks! No way!

I left it alone, went back to socialise some more.

A few hours later, I returned, not sure whether it would be worthwhile or not. I ran a system diagnostic test (booting with ⌘ + D). The diagnostic completed, signalling no errors or problems at all! This wasn’t making any sense: no problems or errors anywhere, yet my system had suddenly become unusable. How is this possible?!

I had to look in another direction now, as perhaps I should have done right from the start. I booted it up in recovery mode again, and this time I opened a terminal session to examine the system differently. I had a hunch that some software had set up a kernel extension that was hijacking my system.

Kernel extensions misbehaving?

In the terminal, I looked for “.kext” files in the system Library. There were a bunch of them. I couldn’t recognise most, which is logical since I’m mostly just a user. However, I was beginning to suspect that Docker for Mac or Dropbox was causing the issue. The reason: I know they are architected to hook directly into the system, listening to low-level events and catching some of them on the fly. Dropbox is a particularly aggressive one; I have to remember to turn it off before ever upgrading Xcode, for example. Whenever I fail to do that, the Xcode installation or upgrade takes forever.
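
For readers who want to run the same check, this is roughly what it comes down to. The vendor names in the grep pattern are my own guesses, not an exhaustive list; adjust them to whatever you have installed. The commands are macOS-specific, so the sketch guards for that:

```shell
# These commands only exist on macOS; degrade gracefully elsewhere.
if command -v kextstat >/dev/null 2>&1; then
  # List loaded kernel extensions that are not Apple's own
  kextstat | grep -v com.apple
  # Look for bundled .kext files from the usual suspects
  # (vendor names in the pattern are examples, not a complete list)
  ls /Library/Extensions /System/Library/Extensions 2>/dev/null \
    | grep -iE 'dropbox|docker|veertu|parallels'
else
  echo "kextstat not available: not on macOS"
fi
```

Anything that shows up in the second listing is a third-party extension you installed, knowingly or not.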

So I had my potential culprits. I deleted every kernel extension associated with Dropbox, Docker for Mac, and Veertu. I also deleted the apps themselves, emptied my account’s Login Items, then restarted the computer. And ta-dah! There it was, back in business again, all responsive and smooth! Phew! This cost me hours I hadn’t planned. I decided to write about it, and to use the machine normally for a couple of days before reaching any definite (temporary) conclusion.

Are we there yet?

After all that effort, I thought I was safe. I was wrong. After a reboot, the system performance would slowly degrade. Before it became totally unusable, I would reboot it and gain some relief, only to be back in the same state within less than an hour of normal use. I was baffled. Alas, having tired of trial and error, I decided it was time to call Apple Support. As it happened, my standard AppleCare support had expired, so I had to purchase a one-time €29 support call. I decided to go for that.

I called Apple, and the friendly voice at the other end took me through the usual drill, paths that I’d already exhausted earlier. We went through reboots, diagnostics, an SMC reset and all that, to no avail. Apple Support suggested that I reinstall the OS from the recovery mode session, and call them back should that fail. I hung up and took a break. I came back and tried to reinstall the OS from recovery mode: it would start for a little while, then fail. I tried this several times, without success. I eventually gave up for the day.

The next day, I called Apple Support again. They now suggested a remote session, for which I was to download and install a program. That process also stalled; I got a zero-length file that wouldn’t install. Apple Support then suggested that I book an appointment with the Genius Bar. I reluctantly thought I’d try, though I doubted it would help. I called the Genius Bar; they were super busy, and the next free slot was a full week away. I couldn’t be without my laptop for that long, so I decided against it. I had to figure it out by myself; buying another computer seemed like a real possibility now. But which one, given that it was allegedly ‘no longer for professionals’? 😉

I went to my Mac Mini, a mid-2011 model, still working fine though noticeably slower. I checked the timestamp of my last successful Time Machine backup: it was one day earlier, and in the interim I hadn’t created anything new on my computer. So I could afford to rebuild it. Using the Mac Mini, I launched the App Store app to download a fresh copy of macOS Sierra. When the download finished, I created a bootable USB stick with it. I tried to boot my MBP from it; that didn’t work. So I booted it up normally, logged in as one of my spare admin users, then mounted the USB stick and launched the installer app. The first attempt failed, but the second worked. Finally, I could let the long-running installation process run to completion.
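
For reference, creating the bootable stick comes down to one command. The volume name below is a placeholder for your own stick’s mount point (mine was different), and on Sierra the --applicationpath flag is still required. Mind that the command erases the stick:

```shell
# macOS-only: write the Sierra installer to a USB stick (this ERASES the stick).
# "/Volumes/MyUSB" is a placeholder; substitute your own stick's mount point.
INSTALLER="/Applications/Install macOS Sierra.app"
if [ -d "$INSTALLER" ]; then
  sudo "$INSTALLER/Contents/Resources/createinstallmedia" \
    --volume /Volumes/MyUSB \
    --applicationpath "$INSTALLER"
else
  echo "Installer not found at $INSTALLER"
fi
```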

Once the OS was reinstalled successfully, I rebooted the machine and logged in with my regular account. I found that everything was where I’d left it, nothing lost, nothing broken, and my computer was back to its shiny best! I had expected to have to restore from backup; I didn’t have to. And since I had long been in the habit of spreading my work files between iCloud and Dropbox, I was sure my work documents were available and intact.

Conclusions

I am not sure what triggered the problem, but it was really startling and irritating to see my computer suddenly become unresponsive. I still haven’t figured out how the camera seemingly turned itself on. Was I hacked? Is there some command line tool or keystroke sequence that starts the camera without using FaceTime, Photo Booth or some other camera-enabled app? I don’t know, and it is not reassuring at all. It might as well have been malware; I had no patience for the forensic work, and rebuilding got rid of whatever troublesome bits there were. As I was contemplating this issue, I saw a tweet in my timeline on the security aspect. It’s a product I once ran into but might now give a try.

When our computers become unstable, we are often quick to blame the vendor. This behaviour is even more acute when it’s open season for bashing a vendor; by this, I mean the cargo cult habit of trashing a company for its alleged failures. Whether justified or not, that’s how people typically behave. I’ve linked a few articles above on this subject. You can’t always blame people; it’s frustration born of feeling powerless, and a sense that you’re getting a service below par. Yes, some are definitely out for grabbing headlines, juicy click-baits, I-was-first kinds of pursuit. In my particular situation, the system was healthy with regular apps running. Having installed a bunch of stuff over time, I eventually reached a point where some system extension caused trouble. So, once trouble hit, by backtracking some of what I’d done, I was eventually able to recover my system. It did take a lot of time, as I hope I’ve detailed well enough. Does this say something about Apple? I am not so sure. Years ago, when I only used Windows PCs and laptops, I also reached such situations at times. Does this say something about software reliability in relation to extensibility? Absolutely!

It’s often a question of trade-offs. I could have pursued the path of external support; it would have cost me time and money, and I’m not even sure the fix wouldn’t have unnecessarily resulted in even more expenditure. By doing it myself, it cost me a lot of personal productivity and leisure time, but I didn’t spend any extra money to solve it. Maybe I was just lucky that there was no hidden hardware fault or other serious issue.

Could it have been a hardware issue? Since the diagnostic tools didn’t report any problems, I have to rule that out. So, yes, it’s possible that Apple software contains some annoying bugs; every piece of software has those. Maybe I was just unfortunate to have hit one. The same is possible with any number of third-party programs I have installed over time. It is also possible that I was infected by some malware. Whatever the case, in this particular situation, having found the resources to troubleshoot my problem and get back to normal operation, it would be harsh to only blame Apple.

Mac, Windows and other Unix/Linux systems are all susceptible to becoming corrupted and unstable eventually, as you install more and more software on them. We, the users, most often find ourselves in that situation. The non-technical user, or the user in a rush, often doesn’t have the resources to fully recover their machine.

If you’ve hung around this far, I thank you for your patience, and I truly hope this tale can save you some time, frustration, or even unnecessary expenditure some day.

Fixing Apple Calendar sync problems on macOS, a tale of caching gone wrong

Ever seen an error message like this before?

Apple Calendar for macOS sync error

If so, could you figure out, as a regular user, what was wrong?

Well, I have run into this problem, in fact many times in the past. I usually ignored it, closed the Calendar app and stopped relying on it. But recently I thought I’d look into it, and I found out why it happens and how to fix it. My first port of call was Google Search, leading to StackOverflow and some Apple articles, all quite frankly misleading, but they put me on the path to a solution.

Such an error would turn off any user who isn’t technical, and the fix is likewise a turn-off. Here is how to fix the problem. I assume that readers might not all be experienced Mac users or very technical, hence the more detailed steps.

1.- Quit Calendar, to prevent it from crashing. Simply close the app; sometimes it gets stubborn and you have to force-quit it.

2.- Stop the CalendarAgent service; it runs in the background. Here is how:

  • Start the Activity Monitor app. I always use ⌘ + Space to find and start programs: press ⌘ + Space, type Activity Monitor, then ENTER.
  • Search for CalendarAgent; just start typing the word. When it’s highlighted, click the Force Quit button, as shown in the illustration.

3.- Now it’s time to open Terminal.

Press ⌘ + Space, type Terminal then ENTER, to start the app. Be cautious in this step; don’t accidentally delete anything. If in doubt, simply move files to your Desktop, for example. First, let’s see what files are there by listing them:

$ ls -lpatrh ~/Library/Calendars/

You’ll see something like this:

list-of-cached-calendar-files

This is macOS caching calendar events locally, on my MacBook Pro. Some of these folders contain files that are no longer valid. Apparently, the Calendar app wasn’t able to figure that out and is throwing up its hands in confusion. With some patience I could go through these folders, find out exactly which ones are bad, and remove just those. But no, I don’t have time for that: I just get rid of them all and let Calendar re-sync with the servers. In my example, I move the files to Downloads, which can be emptied if everything goes well.

$ mv ~/Library/Calendars/* ~/Downloads/

Now that the cached files are gone, it’s time to start Calendar app again.

4.- Optionally, toggle Calendar OFF then back ON

Since I am unsure how many other places caching may be occurring in, I apply my usual macOS toggle technique: I turn OFF and then back ON the Calendar feature of my Internet Accounts. To do so, press ⌘ + Space, type Internet Accounts then ENTER.

5.- Open Calendar app again

When Calendar starts up now, it should find that it needs to download everything from the servers, it will do so, and the error should be gone for good. Phew! What a hassle that was!
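
For the impatient, steps 1 to 3 above can be condensed into a short terminal session. The backup folder name is my own convention; the point is to move the cache aside rather than delete it, so you can roll back:

```shell
# Quit Calendar and its background agent; ignore errors if they aren't running
killall Calendar CalendarAgent 2>/dev/null || true

# Move the cached calendar data aside instead of deleting it outright
BACKUP="$HOME/Downloads/Calendars-backup-$(date +%Y%m%d)"
mkdir -p "$BACKUP"
mv "$HOME/Library/Calendars/"* "$BACKUP"/ 2>/dev/null || true

echo "Cached calendar files moved to $BACKUP; now relaunch Calendar"
```

If the re-sync completes cleanly, the backup folder can be emptied.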

This shouldn’t happen, yet it does, and it is not user friendly in many ways. For a start, the problem seemingly appeared out of nowhere, in the sense that I am not aware of any specific user action of mine that should result in such an outcome. This means that 99% of regular users would find it baffling at best. Second, the error message means nothing because, again, I never entered such a URL anywhere while setting up my Mac. I can see what it is suggesting, but it was all auto-magic while configuring my email accounts, so I shouldn’t be asked to fix a URL that I never actually typed.

What was the problem then?

In short, it was a caching problem. The clue can be seen in the suggested buttons. Here it is:

revert to server

This is telling me that a local copy, hence a cache, is somehow not aligned with the server. Looking at the URL, I can’t see anything wrong, and all my other devices, iPhone and iPad (Android or others, if that’s what you have), are functioning properly. In earlier attempts, I did what the dialog suggested; it would pop up again and again. I also tried the Ignore option on occasion, to the same effect: the dialog keeps popping up. As a user this is very confusing, because you don’t know how many more times it will appear or why it keeps nagging; my hunch at this point is that the dialog occurs once for each problematic folder found in the Calendars cache location. Worse, after a reboot, and sometimes after a re-logon, the same ugly dialog pops up again, repeatedly. This is infuriating; as a user, it seems I can’t put this issue behind me. If this happens to you, hopefully the steps above will resolve it.

I’ve had Calendar working properly many times, over very long stretches, without any trouble. So I knew the Calendar app can be reliable; I just didn’t know what tripped it up, or what else I might be missing should I continue ignoring the error. That is what eventually motivated me to look into it.

Summary

One might ask: how would Apple ever allow such a seemingly trivial problem to occur? Why not prevent it by design? I can see a number of reasons why it might not be so easy to solve this kind of problem in a durable way. For example, when a local copy of a calendar event isn’t aligned with the server, it may be because changes on one device haven’t made it to the server or to other devices yet. With many devices on the go, determining which one has the complete overview might not be a trivial task. People would be furious if they lost data to some clever sync algorithm. I could dive into this subject at length, but that wouldn’t fit in this post.

Caching is very handy when it works. Apple uses it profusely, everywhere and in many ways. But when it breaks, lots of time can be lost trying to diagnose and fix it. I don’t have any stats, but caching could be the cause of a proportionally large number of headaches that we routinely face with our computing devices. In this instance, especially for Apple products, it’s really hard to find out what may be going on.

Since Apple doesn’t tell us much about the inner workings of their software products, a user, however savvy, may not have much of a clue how to address this kind of problem. This situation causes a lot of frustration amongst users of Apple products, particularly the technically savvy ones. I don’t have a clear-cut answer. It’s easy to guess that the very large majority of Apple users aren’t technically savvy, hence wouldn’t venture to fix technical issues. However, a case could be made for including an Expert Mode, for those who feel like having a go. Maybe even an Insane Mode (borrowing from Tesla) for the really daring folks, even if that would mean voiding the warranty or signing some sort of disclaimer. We should be able to tinker with our toys if we feel like it; there is no need for a father figure for the entire community.

In the department of Too Subtle UX, your Apple ID is the unique passport for iCloud on every device

I was triggered by the following tweet:

I quickly read the article and saw what was going on. Indeed I had noticed it, but I knew to expect that kind of behaviour. The good news is, the feature seems to be working correctly: with a single Apple ID, you can log in to any device and get access to your desktop and documents thanks to iCloud. The bad news is, most people will easily overlook the single Apple ID part; they only think about each device they’re using and don’t make the mental link to a single iCloud account. Apple, with their clout, are bound to know this and should be expected to anticipate it. However, as far as I can tell, people weren’t explicitly forewarned; they’re apparently supposed to be obviously aware of owning a single iCloud account (as in: duh, what else were you thinking?). And that is precisely the problem: this kind of logical behaviour, although intuitive at first sight, doesn’t take into account the long-established mindset that we all have. Therefore, while upgrading the OS, you wouldn’t think twice before enabling iCloud sync.

In my personal experience, iCloud did the right thing. The first computer I upgraded was my MacBook; my Desktop and Documents were all properly synced to iCloud. Then a few days later, I upgraded my Mac Mini, and this time iCloud added a new folder on my ubiquitous iCloud-backed desktop, prefixing the folder name with my Mac Mini’s hostname. In an instant I recognised what had happened and wasn’t surprised. It seems that the gentleman who posted the article didn’t get the same nice experience that I did. I won’t speculate on that particular case, but had Apple spoken about this topic loudly and clearly as they geared up for the official launch, people wouldn’t have been startled, and maybe some would have thought things through before upgrading.

To my mind, this is a perfect illustration of the kind of problem I was referring to when I wrote the following post: When the UX interaction can be too subtle.

Maybe this article won’t go anywhere, many having had a good experience. Or maybe, since the twitterati is always ready for an outrage-tweetstorm, there will be plenty of chatter and not just compliments. We’ll see. Funnily enough, I was actually pondering this weekend the apparent lack of a something-gate to do with macOS and iOS 10 in social media. I didn’t have to wait long. 😉

The original article is here, exactly as its title reads:

maybe be careful with osx sierra

Four days with my MBP upgraded to macOS Sierra: a few minor tweaks, but so far no complaints.

Yes, I could’ve waited a couple more days. But had I wanted to wait a couple of days, then I wouldn’t actually have wanted to upgrade anytime this year. The reason is simple: anything that still breaks in a significant way right now will probably not see a good fix until a few months down the line. That has been my experience with previous releases. That’s why I decided, this time round, that if something should fail to install or run, I’d look for a Docker container and not bother about it much.

Docker

Yay! It ran without me doing anything at all. I simply let the upgrade run its course and the restarts happen; when that was all done, Docker continued to work as before. Now, I can’t recall; I might have downloaded Docker for Mac once again. In any case, I tend not to remember technical things that went to plan. This flawless upgrade gave me the assurance that my fallback plan, running things in containers, was going to work.

Here is what I did to keep it humming.

Scala and Java

There was an issue with TLS; I fixed it by pointing the JVM to the new location of the CA certs with this setting:

-Djavax.net.ssl.trustStore=/Library/Java/Home/lib/security/cacerts

I did this for both JAVA_OPTS and SBT_OPTS, and then I could run Java apps again.
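
Concretely, that amounted to something like the following in my shell profile. The trust store path is the one from my machine; verify it exists on yours before copying it:

```shell
# Point the JVM at the CA certs location left behind after the Sierra upgrade
TRUST_STORE="-Djavax.net.ssl.trustStore=/Library/Java/Home/lib/security/cacerts"
export JAVA_OPTS="$JAVA_OPTS $TRUST_STORE"
export SBT_OPTS="$SBT_OPTS $TRUST_STORE"
```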

IntelliJ IDEA

If this hadn’t worked, I might seriously have considered downgrading. But no, I didn’t have to downgrade; it worked first time. The one issue I ran into was with a Java app that tried to establish a TLS connection to the outside world; that was resolved by the previous fix. I ran all the JetBrains tools that I use, AppCode, IDEA, DataGrip, and so on, with no problems in any of them. So this meant that whatever broke from this point on would be re-homed in Docker, and I wouldn’t deal with it.

Ruby

I’m using RVM; as expected it complained and wouldn’t install or re-install a Ruby binary. I checked the version that ships with macOS Sierra, ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-darwin16]; it was good enough for me, so I settled for that. Here is what I had to do to get RVM and the Ruby stack going:

  • Install the command line tools (this shouldn’t be necessary given that I have Xcode installed, but I’ve been at this spot before and didn’t want to spend any time on it):
  • $ xcode-select --install
  • Switch to the system Ruby distro:
  • $ rvm use system
  • Restore the installed gems to pristine; everything worked from there:
  • $ gem pristine

Haskell Stack

GHC installs fine, but Haskell Stack doesn’t. I ran brew install ghc, but gave up on Haskell Stack as it just wouldn’t succeed. I ran out of my budgeted time, so I stopped trying.

Rust

I got a few misleading error messages from this one, but it turned out the real culprit was a prior failed installation via brew. I ran brew doctor, then looked for and removed all libraries that obviously looked related to Rust; you can’t miss them. Once that was done, I just built Rust from source and installed it. I haven’t used it in anger yet, but it seems to work fine now.

GO

Not a single problem. I didn’t even have to touch it, it worked as it always did.

Homebrew

I ran brew doctor to find out what was broken, and discovered I had a partially installed Rust version that would no longer install nor uninstall. I solved that the hard way: I got rid of all the related libraries, as reported by brew. After that everything worked fine.
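
The “hard way” looked roughly like this. The exact file names brew reports differ per machine, so treat this as a sketch rather than a recipe, and inspect what brew doctor prints before removing anything:

```shell
# Homebrew-specific; guard so the sketch degrades gracefully elsewhere.
if command -v brew >/dev/null 2>&1; then
  # Let brew report unexpected files, broken symlinks and leftovers
  brew doctor
  # Show anything rust-related that brew knows about, then remove it
  brew list 2>/dev/null | grep -i rust
  brew uninstall --force rust 2>/dev/null || true
else
  echo "brew not available on this machine"
fi
```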

My daily tools

I found out I actually use lots of tools; none of them showed any issues. Omnigraffle, Sketch, Android Studio, Atom, TextMate, Parallels Desktop, PostgreSQL and half a dozen other command line tools: not a single glitch with any of them.

Forward with Sierra

This is the chance for a clean slate on one topic that I’ve had in mind for a while. If you follow this blog, you’ll remember my statement about containers providing a new chance to compartmentalise software components. With macOS Sierra, this is my opportunity to pursue that idea. I will stop installing development components via Homebrew; instead, I’ll look for Docker containers. This is what I plan to do for everything like DBMSs, ElasticSearch, and other cluster-native stuff like Riak, Spark, and so on. Basically, it no longer makes sense to install these locally; it’s best to use a container-based cluster manager, as that is the only likely use for them.
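
As a sketch of what that plan looks like in practice: the image names below are the standard Docker Hub ones, while the container names, ports and password are illustrative, not a recommendation:

```shell
# Run services as containers instead of brew-installed local daemons.
# Guarded so the sketch degrades to a message on machines without Docker.
run_svc() {
  name="$1"; shift
  if command -v docker >/dev/null 2>&1; then
    docker run -d --name "$name" "$@"
  else
    echo "docker not available, would have started: $name"
  fi
}

run_svc pg -e POSTGRES_PASSWORD=secret -p 5432:5432 postgres
run_svc es -p 9200:9200 elasticsearch
```

Stopping and removing a service is then just docker stop and docker rm, with nothing left lingering in the host OS.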

Closing notes

Please use your own judgment and decide whether you want to risk installing bleeding edge software on your production machine. Remember the Leslie Lamport statement that I often paraphrase: no two computers will likely have the same state. This is important because the slightest difference (and there will always be a lot of them) could result in different behaviours between two seemingly equivalent environments. If nothing else, ensure you have backups that you can verifiably restore your machine from, before attempting this kind of upgrade. If you still go ahead and break your machine, don’t say that I didn’t warn you.

That being out of the way: if you decide to give it a go and it goes well, you might enjoy the experience. Since the upgrade I’ve noticed that I actually gained about 40GB of additional free space, probably thanks to a feature that promises to move rarely used items off my computer and into iCloud. Yes, you guessed right: I have an iCloud account.

Qubes OS Project, a secure desktop computing platform

Given that the majority of security annoyances stem from antiquated design considerations, and considering the progress made in computing and the affordability of computing power, this is probably how operating systems should now be built and delivered.

Qubes is a security-oriented, open-source operating system for personal computers.

Source: Qubes OS Project

OS X Yosemite, why block my view when you should’ve known better?


I should have made this a series; here is another one on OS X annoyances. I now frequently experience several apps freezing for no apparent reason. Yet again, a new behaviour that, until now, 8 years after switching from Windows to Mac, I hadn’t expected to experience. Standard apps like Finder.app, Preview.app, Mail.app or Safari.app would just stop responding.

safari_preview_finder

Normally, if an app stops responding, this will show in Console.app. In these instances, Console.app was showing a clean bill of health; nothing was stuck. But as a user, I could type any number of keys and move the mouse around: Finder.app didn’t respond, and Spotlight didn’t instantly find any answers, whereas it normally does as you type characters. I use Spotlight to launch apps, so when it doesn’t respond, my workflow is interrupted. I then immediately turned to Alfred.app, and sure enough Alfred was working fine and could carry out any task I usually throw at it. What the heck was going on now?

Screen Shot 2015-05-20 at 22.43.43

I started to suspect a deadlock situation, invisible to the regular app monitor. I then looked for what might be hogging resources, and saw something interesting.

dropbox_and_Spotlight_max_out_the_cpu

Two processes were occupying 130% of the CPU; effectively, 2 out of the 4 CPUs on my machine were fully utilised. I had 2 more CPUs that could potentially do work for me, and they did try, only to soon get stuck. The Dropbox app is easy to recognise; the second hungry process, mds, is actually the Spotlight indexer.

Dropbox was clearly working hard synchronising files to the cloud, but what was mds doing? I had recently moved around a large number of files; this may have invalidated the Spotlight index, which it was now trying to rebuild. All fine, but I’d always thought that only happened when the machine was idle. Furthermore, I expected that the Spotlight indexer wouldn’t make the UI unresponsive. I was wrong on both counts.

I found out that when Spotlight gets into such an aggressive reindexing, Finder.app also stops being responsive. This has some consequences: some apps appear to work fine, and I can launch other apps that may be snappy and all, as long as they don’t go anywhere near Finder.app. The overall impression is that the Mac is unstable without any app appearing to hang. How is this possible? Then I remembered what I always chided Windows for: some tasks were unnecessarily channelled through the UI layer stack, making them sluggish and prone to getting stuck. That is the same behaviour I was now observing.

force_quit_spotlight_indexer_for_responsiveness


To confirm my hypothesis, as soon as I killed the Spotlight indexer, Finder.app, Preview.app and others immediately became responsive again. I repeated the experiment many times over before writing this post.
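Rather than force-quitting the indexer from Activity Monitor each time, indexing can be paused with Apple’s stock `mdutil` tool. Here is a minimal sketch of that workaround; to stay on the safe side it only prints the privileged commands (a dry run), so you can review the output before piping it to `sudo sh`:

```shell
#!/bin/sh
# spotlight_cmd: print the mdutil command for a given action (dry run only).
# mdutil ships with OS X and manages the Spotlight (mds) index per volume.
spotlight_cmd() {
  case "$1" in
    pause)   echo "sudo mdutil -i off /" ;;  # stop indexing the boot volume
    resume)  echo "sudo mdutil -i on /" ;;   # re-enable indexing
    rebuild) echo "sudo mdutil -E /" ;;      # erase and rebuild the index
    *)       echo "mdutil -s /" ;;           # default: show indexing status
  esac
}

# Review, then apply: spotlight_cmd pause | sudo sh
spotlight_cmd pause   # prints: sudo mdutil -i off /
```

As an immediate stopgap, `sudo killall mds` also works, but launchd restarts the daemon right away, which is why pausing indexing with `mdutil` is the more durable option.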

I found another sure way to get Preview.app stuck: any attempt to rename a file, move it to a new location, or add tags to it directly from the Preview.app menu causes both Preview.app and Finder.app to become unresponsive for a long time.

Screen Shot 2015-05-20 at 22.45.27


My conclusion, from here, was that some of the standard apps that ship with OS X Yosemite contain code that is very old or simply badly designed. Such code, typically the work of UI framework enthusiasts or of design principles from another era, traverses a UI layer stack for tasks like disk and network access, although it shouldn’t have to.

Most users would typically get frustrated and decide that OS X is just bad software; others might think about rebuilding their machine. I only looked into it briefly, and didn’t bother digging into the SDKs, APIs and other kernel-debugging tricks to get to the true bottom of it.


Leadership drive: from ‘despises Open Source’ to ‘inspired by Open Source’, Microsoft’s journey

With a change of mind at the top leadership level, Microsoft showed that even a very large company can turn around and adopt a customer-focused approach to running a business. By announcing Nano Server, essentially a micro-kernel architecture, Microsoft is truly joining the large-scale technology players in a fundamental way.

A video on Microsoft’s Channel9 gives a great illustration of the way Microsoft is morphing its business to become a true champion of open source. I took some time to pick some of the important bits and go over them.

I remember the time when Microsoft was actually the darling of developers, the open source equivalent of the day, as I entered this profession. I was a young student, eager to learn, who couldn’t afford any of the really cool stuff. Microsoft technology was the main thing I could lay my hands on; Linux wasn’t anywhere yet. I had learned Unix, TCP/IP and networking, and I loved all of that. Microsoft had the technical talent and a good vision, but they baked everything into Windows, both desktop and server, when they could have evolved MS-DOS properly into a headless kernel that would get windowing and other things stacked upon it. They never did, until now. The biggest fundamental enabler was probably just a change in the leadership mindset.

The video presents Nano Server, described as a Cloud Optimized Windows Server for Developers. On a single diagram, Microsoft shows how they’ve evolved Windows Servers.

Microsoft Windows Server Journey
Microsoft Windows Server Journey

Considering this diagram from left to right, it is clear that Microsoft became increasingly aware of the need to strip out the GUI from unattended services for an improved experience. That’s refreshing, but to me, it has always been mind-boggling that they didn’t do this many years ago.

Things could have been different

In fact, back in the mid-’90s, when I had completed my deep dives into Windows NT systems architecture and technology, I was a bit disappointed to see that everything was tied to the GUI. Back then, I wanted a Unix-like architecture, an architecture that had been available since before I knew anything about computers. I wanted the ability to pipe one command’s output into the input of another. Software that requires a human present and clicking buttons should only exist on the desktop, not on the server. With Microsoft, there were always windows popping up and buttons to be clicked. I couldn’t see a benefit to the user (systems administrators) in the way Microsoft had built its server solutions. It was no surprise that Linux continued to spread, scale and adapt to Cloud workloads and scenarios, while Windows remained mainly confined to corporate and SMB environments. I use the term confined to contrast departmental IT with Internet companies, the latter having mushroomed tremendously over the last decade. So, where the serious growth was, Microsoft technology was being relegated.

Times change

When deploying server solutions mainly covered collaboration services and some departmental application needs, people managed a small number of servers. The task could be overseen by a few people, although in practice IT departments grew larger and larger. Adding more memory and storage capacity was the most common way of scaling deployments. Though rather inconvenient, software came on CD-ROMs, and someone had to physically go and sit at a console to install and manage applications. This is still the case for lots of companies. In these scenarios, physical server hardware is managed a bit like buildings: servers have well-known names, locations and functions, and administrators care for each one individually. The jargon term associated with this is server as pet. Even with best efforts, data centre resource utilisation remained low compared to the large available capacity (the typical figure is 15% utilisation).

Increasingly, however, companies grew aware of the gains in operations and scalability from adopting cloud scaling techniques. Such techniques, popularised by large Internet companies such as Google, Facebook, Netflix and many others, mandate that servers be commodities that are expected to crash and can easily be replaced. It doesn’t matter what a server is called: workloads can be distributed and deployed anywhere, and relocated onto any available servers. Failing servers are simply replaced, mostly without any downtime. The jargon term associated with this approach is server as cattle, implying that servers exist in large numbers and are anonymous and disposable. In this new world, Microsoft would have always struggled for relevance because, until recently with Azure and newer offerings, their technology just wouldn’t fit.

the voice of the customer
the voice of the customer

So Microsoft now really needed to rethink their server offerings with a focus on the customer. This is customer-first, driven by user demands, a technology pull, instead of the classical technology-first model: I build it and they will come, a push model in which customer needs come after many other considerations. In this reimagined architecture, the GUI is no longer baked into everything; instead it’s an optional element. You can bet that Microsoft had heard these same complaints from legions of open source and Linux advocates many times over.

Additionally, managing servers used to require either sitting in front of the machines or firing up remote desktop sessions so that you could happily go on clicking all day. This is something Microsoft appears to be addressing now, although in the demos I did see some authentication windows popping up. To be fair, this was an early preview; I don’t think they even have a logo yet. So I would expect that when Nano Server eventually ships, authentication will no longer require popup windows. 😉

the new server application model
the new server model, the GUI is no longer baked in, it can be skipped.

The rise of containers

Over the last couple of years, the surge in container technologies really helped bring home the message that the days of bloated servers were numbered. This is when servers-as-cattle takes hold, where it’s more about workload balancing and distribution than about servers dedicated to application tiers. Microsoft got the message too.

microsoft nano server solution
microsoft nano server solution

I have long held the view that Microsoft only needed to rebuild a few key components to come up with a decent headless version of their technology. I often joked that only the common controls needed rewriting, but I had no doubt it was more a matter of a political decision. Looking at the next slide, I wasn’t too far off.

reverse forwarders
Reverse forwarders, a technical term to mean that these are now decent headless components

Now, with Nano Server, Microsoft joins the Linux and Unix container movement in a proper way. You can see that Microsoft takes container technologies very seriously: they’ve embedded them into their Azure portfolio, alongside Microsoft’s support for Docker container technologies.

Although this is a laudable effort that should bear fruit in time, I still see a long way to go before users, all types of users, become truly central for technology vendors. For example, desktop systems must still be architected properly to save hours of nonsense. There is no reason why a micro-kernel like Nano Server couldn’t be the foundation for desktop systems too. Mind you, even with multi-core machines with tons of memory and storage, you still get totally unresponsive desktops when one application hogs everything. This should never be allowed to happen; the user should always be able to preempt his or her system and get immediate feedback. That’s how the computing experience should be. It’s not there yet, and it’s not likely to happen soon, but there is good progress, partially forced by the success of free and open source advocacy.

If you want to get a feel for how radically Microsoft has changed its philosophy, and you are a technically minded person, this video is the best I’ve seen so far. You will see that the stuff is new and being built as they speak, but listen carefully to everything being said and watch the demos: you will recognise many things that come straight from free and open source and other popular technology practices: continuous integration, continuous delivery, pets vs. cattle, open source, etc. I didn’t hear them say whether Nano Server would be developed in the open too, but that would have been interesting. Nano Server: a cloud-optimised Windows Server for developers.

OS X Yosemite adaptive networking, a blessing that’s been a curse for my MacBook lately

When my computer detects several known (or eligible) networks that it can connect to, networking becomes unstable without the system ever showing any errors. I resorted to forcing only one network, to regain stability.

I experienced quite some frustration when my computer’s networking became unstable without ever notifying me of any problems. After some trial and error, I found out that the problem occurs whenever I have more than one eligible network within reach. In the end, I had to manually enforce some fixed connections to regain decent stability.

Apple introduced a neat feature in OS X Yosemite: the computer can automatically switch to the best network it can find without interrupting your programs. They also introduced another feature: OS X randomises the MAC address (the hardware address of the network adapter), which should make the computer a little more secure. Both have been hailed as quite useful updates. The first comes in handy when a network connection suddenly drops but the computer is able to re-establish it or hop onto another network; while downloading a large file, for example, you’d appreciate that the download just progresses to the end rather than restarting from scratch. That’s a nice time saver. The second feature, the MAC address randomiser, helps prevent, say, coffeeshop Wi-Fi routers (or malware) from identifying someone’s machine. Lately, however, I believe I’ve been at the wrong end of these features. After several updates and trying various things, I’ve come to the conclusion that these features are working against me.

I have used only Wi-Fi on my laptops for years. Over the past weeks, I’ve had a torrid networking experience whereby my computer would intermittently lose its network connection without notifying me of any problems. I’d check and see a full-strength Wi-Fi signal and that I was still connected to the network, yet none of the applications I was running could reach the outside world on the Internet. Without me doing anything, the Internet connection would come alive again and I could do a few things; then it would drop once again. The pattern repeated many times. At first I didn’t think much of it, but I quickly grew annoyed and set out to resolve the issue. After scouring forums, Stack Overflow and other random sources in vain, I was on my own. What eventually brought stability back to my networking was this:

  • Let the computer look for a Wi-Fi network
  • Once it establishes a connection, turn off the automatic network detection
  • Delete any other known networks in my vicinity that the computer had already registered
MBP OSX Yosemite don't ask for networks
MBP OSX Yosemite don’t ask for networks
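The manual pruning steps above can also be scripted with Apple’s `networksetup` tool. A hedged sketch: it only prints the removal commands (a dry run), and the interface name `en0` plus the sample network names are assumptions on my part; check yours with `networksetup -listallhardwareports` and `networksetup -listpreferredwirelessnetworks en0`:

```shell
#!/bin/sh
# wifi_prune: print commands that forget every remembered Wi-Fi network
# except the one to keep (dry run; pipe the output to `sudo sh` to apply).
# "en0" is an assumed interface name; verify it on your own machine.
wifi_prune() {
  keep="$1"; shift
  for ssid in "$@"; do
    [ "$ssid" = "$keep" ] && continue  # leave the trusted network registered
    echo "sudo networksetup -removepreferredwirelessnetwork en0 \"$ssid\""
  done
}

# e.g. keep HomeAP and forget the rest (all names here are made up):
wifi_prune HomeAP HomeAP CoffeeShop Ziggo1234
```

With the leftover networks forgotten, the automatic switcher has nothing to hop to, which is exactly the state the three manual steps aim for.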

After doing this, I get a stable connection. But at home, it turns out I had one more complication. My UPC (Ziggo) subscription includes a Wi-Fi router, but I also have an Apple AirPort Express. When both are active, my computer detects two known Wi-Fi networks, so it starts hopping back and forth between the two without telling me. I would end up in the same situation: unable to get on the Internet for no apparent reason. To resolve this, I turned off the Ziggo Wi-Fi router.

What I think may be happening is the following. The computer detects a Wi-Fi network, requests and obtains an address, and thus gets on the network. But shortly afterwards it detects another network with a slightly (and intermittently) better signal strength, and hops onto that new one. Wi-Fi being a radio signal, the fluctuations of the two (or more) signals cause the computer to keep jumping around. When this is combined with the randomised MAC address allocation, the Wi-Fi routers may temporarily quarantine the computer before allowing it back in. The user (me!) then experiences the computer simply losing all network connectivity for reasonably long spells, 3 to 5 minutes, then inexplicably regaining it, and looping back into the same game, on and on. This is what I think has been going on with my MBP, and that’s why I decided to try forcing a semi-manual network setup: essentially, to stop it being too clever.
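One way to test this hopping hypothesis would be to log which access point (BSSID) the machine is associated with over time. A sketch, assuming Apple’s private `airport` utility at its long-standing path; the parsing is split into its own function so it can be sanity-checked without the utility:

```shell
#!/bin/sh
# Log BSSID changes over time to catch the machine silently hopping
# between access points. AIRPORT is an assumed path to Apple's private tool.
AIRPORT=/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport

# extract_bssid: pull the "BSSID:" field out of `airport -I` style output
extract_bssid() {
  printf '%s\n' "$1" | sed -n 's/^ *BSSID: //p'
}

# watch_hops: print a timestamped line every time the BSSID changes
watch_hops() {
  last=""
  while :; do
    bssid=$(extract_bssid "$("$AIRPORT" -I)")
    [ "$bssid" != "$last" ] && echo "$(date '+%H:%M:%S') now on $bssid"
    last="$bssid"
    sleep 5
  done
}
```

Running `watch_hops` in a terminal while the connection misbehaves should, if the hypothesis holds, print a stream of alternating BSSIDs.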

Adobe Slate, an attractive tool for publishing nicely laid out content

It should have been possible to easily author and publish polished web content with word processors. It wasn’t, and still largely isn’t. A recently introduced product, Adobe Slate, seems to have solved this for iPad users.

It is time to revise a past blog post where I talked about a missed opportunity for word processors. With Adobe Slate, it is fair to say that iPad users can now easily publish nicely laid-out content. I tried it briefly; it’s effortless to start with, all you need is some content. I think this should always have been possible with the word processors that have been on the market all this time.

There is one drawback to Adobe Slate though, it requires an account with Adobe Cloud. I understand the rationale, but this, to me, is an unnecessary barrier to adoption.

Adobe Slate web site.


Handy shortcut to keep WiFi running on OS Yosemite: restart the DNS resolver

A handy shortcut to help keep WiFi running on a MacBook Pro with OS X Yosemite. The DNS resolver appears to be problematic with WiFi: the machine frequently loses its network connection, and sometimes won’t connect for long minutes. Restarting the resolver makes the issue go away most of the time. I made a handy bash shortcut to do this.

A long time ago, while I was studying, I had a PC running Microsoft Windows 2 (yes indeed, Windows version 2). It came with a program called Write, which I was using to type my homework and eventually my graduation assignment. This thing was unstable, it crashed so often that I learned to press CTRL+S at the end of every line of text that I typed on it, to be sure that I didn’t lose my work. The habit never left me. It wasn’t until about 4 or 5 years ago, long after I had already switched to Mac and didn’t need to worry about CTRL+S, that I finally lost the habit of instinctively hitting that key combination every few minutes.

I have an annoying issue with my Mac: it just randomly loses its network connection, and sometimes it won’t connect at all for a few minutes. After a couple of updates that promised to have fixed the issue, it’s still there. So I made this shortcut, three very short bash aliases that I placed in my .bash_profile startup script.

alias down-discoveryd='sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.discoveryd.plist'
alias up-discoveryd='sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.discoveryd.plist'
alias restart-network='down-discoveryd && sleep 3 && up-discoveryd'

One alias would have been enough, but I wanted it a little prettier, so I made three. I added a small delay for good measure, though it could probably be omitted. The only command I need to run is the last one, restart-network; I get prompted for the admin password, and the service is restarted. If the network is still not restored, I run it again, and again. After 2 or 3 attempts, I get my network connection back and can continue working.
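Since it sometimes takes two or three runs, the retry can be folded into one function built on the same launchctl commands. A sketch; the connectivity probe (a single ping to www.apple.com) is my own assumption of what counts as “network restored”, so substitute any host you trust:

```shell
#!/bin/sh
# restart_network: bounce discoveryd until the network is back, up to N tries.
PLIST=/System/Library/LaunchDaemons/com.apple.discoveryd.plist

# network_up: succeed when the outside world is reachable (the probe host
# is an assumption; any reliable server will do).
network_up() { ping -c 1 www.apple.com >/dev/null 2>&1; }

restart_network() {
  tries="${1:-3}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    sudo launchctl unload -w "$PLIST"
    sleep 3                       # give the daemon a moment to settle
    sudo launchctl load -w "$PLIST"
    network_up && return 0        # connection restored, stop retrying
    i=$((i + 1))
  done
  return 1                        # still down after all attempts
}
```

Calling `restart_network 3` then replaces the run-it-again-by-hand loop described above.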

I find myself using this shortcut very frequently. It has become my new CTRL+S. Unfortunately.