Alxandr.me

/* TODO: Implement */


The problems with portable class libraries, and the road to solving them

First of all, I'd like to say two things. I've opened a GitHub repository over at https://github.com/Alxandr/Blog where you can register issues if there's anything you'd like me to write about. I do .NET and JavaScript, and some other stuff too, so if there's something you think I might know and that you would like to learn a bit about, send me an issue over there.

Secondly, this post is going to be highly technical. I'm going to attempt to go fairly deep into some core parts of the CLR and explain how things work. Also, because this is an advanced topic, and I probably don't know everything I should about it, there's a chance I might get something wrong. So if you find something here that's not correct, please send me an email, or add an issue at the GitHub repo mentioned above.

Now that that's over, let's talk portable class libraries.

Portable Class Libraries

Portable Class Libraries (or PCLs for short) are a technology most .NET developers know of, and a lot of us probably have a love-hate relationship with them. PCLs enable developers to use a single codebase, without any modifications, to generate assemblies that can be consumed by multiple targets, for instance Silverlight or Windows Phone. But before we get into how PCLs work, let's talk a bit about how one would solve the problem they address before there were PCLs.

The ways of before

If we go back a few years, and I wanted to make a library that worked across multiple targets (let's say Silverlight, Windows Phone and the .NET Framework v4.0), I would need 3 different projects, one for each of the target frameworks. This is still done today, mind you. You see projects that have Project.Net40, Project.Silverlight, Project.WindowsPhone, and this approach is probably the most configurable way to make something that runs on multiple platforms. You can manage exactly what runs on which platform yourself, and if some things only work on one, you just leave that piece of the code out on the others. However, there is a problem with just doing multiple projects like that: you end up with n times more code (n here being the number of target platforms you have)! Now, this is obviously unacceptable, because if I change something somewhere, I need to remember to change it everywhere (and being a lazy developer I would obviously sooner or later forget to). So you rarely (if ever) see projects where the sources are just copied between the projects.

What people do instead of copying the sources is link them. So, you have multiple csproj files, but they all point to the same source files. And if you need to have subtle differences on the different platforms, you resolve that by using #if directives. However, there are issues with this method too. First of all, you still have to create (in our case) 3 projects. Then you need to configure each of them to have at least one unique DEFINE, and lastly, you have to add the linked sources. If this is a ready-made project that doesn't change too much, this isn't too much of a hassle, but if this is a budding project, where you add and remove files every day, you have to manually add and remove the linked files from each of the projects! It's for reasons like this that the project linker was created. And though it greatly helped with the hassle of having to manage multiple projects, the hassle was still partially there.

Another problem with this "many assemblies" approach is the fact that dependency management becomes that much more problematic. Say I made library A, that depended on library B, both of which were available for the same three platforms. Now, I'd have to download 3 copies of B, and put them in my solution folder (or relative to it), then reference the correct B from each of the different projects I made. This quickly gets messy. On top of this, I'd also have to do triple the work any time I were to upgrade B. Otherwise I might end up with different platform builds of my project depending on different versions of B, and imagine what a hassle that would be for whatever project was depending on A.

The ways of today

Having grown tired of managing all these projects, some smart folks at Microsoft came up with an idea of how to make things better for everyone. Enter the PCL that we all love and hate. PCLs allow us to create a project and target multiple platforms, then write code once, using the subset of the APIs that's available on all the selected platforms. You get one DLL that can run on multiple platforms. Great, right?

Well, sure, PCLs have a lot of benefits over having to manage multiple projects. However, there are a few quite distinct drawbacks. First of all, when you create a portable class library, you select two or more target platforms from a list of available platforms. In our case (going by the example in the previous section, yet upgrading it for today) that would be net45, Silverlight and Windows Phone. Now, let's say our library is a tiny library that has a single function that goes and finds us a cat image on the internet (simple, right?). Something like this:

using System;

namespace CatFinder
{
    public class Cat
    {
        public byte[] GetCatImage()
        {
            // Magic implementation goes here
            throw new NotImplementedException();
        }
    }
}

This is quite simple: there's only one function, and it returns a byte array with the image we requested. Now, as already said, we want to use this on our phone, which can have a bad internet connection at times, so downloading a cat image might take a while (at least more than a few milliseconds). The same can obviously be true for desktop too, but it's more likely to take time on a mobile. Now, taking time to download our cat image is fine (we can't really do anything about it), but doing so on the main thread is not. If we used this function on the main thread of our application, the main thread would be blocked for the entire time the image was being downloaded, and the user would experience the application hanging. Now, as we've all been in this game for a while, we all know that the obvious solution to this problem is to add a GetCatImageAsync variant of the GetCatImage method. But wait, Silverlight doesn't support async.

Note: I'm ignoring the fact that there is a NuGet package that provides async for legacy platforms and that probably works with Silverlight. This post is about the problems with PCLs, and this problem highlights them, even though there exist workarounds for this particular example.

If we were using the old method, with one project per target framework, this would be an easy problem to solve. We could simply change the code to this:

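using System;
using System.Threading.Tasks;

namespace CatFinder
{
    public class Cat
    {
        public byte[] GetCatImage()
        {
            // Magic implementation goes here
            throw new NotImplementedException();
        }

        // SL is the symbol defined only in the Silverlight project
        #if !SL
        public async Task<byte[]> GetCatImageAsync()
        {
            // Magic implementation goes here
            throw new NotImplementedException();
        }
        #endif
    }
}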

With this code, Silverlight users wouldn't get the goodness of the async variant. However, they could still use our library. But in PCLs we can't do this. Working with PCLs, you are forced to use only the APIs that can be proven to be supported on all the platforms you've selected. This means that even though different platforms might support the same functionality using different APIs, none of these APIs can be used.
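To make the "only what every platform supports" rule concrete, here is a minimal sketch of how the usable API surface shrinks as platforms are added. The platform monikers and the API lists are made up for illustration (the real lists come from reference assemblies, and the `PortableSurface` class is hypothetical), so treat it as a picture of the idea rather than of the actual tooling.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PortableSurface
{
    // Hypothetical API lists per platform, for illustration only.
    public static readonly Dictionary<string, string[]> Apis = new Dictionary<string, string[]>
    {
        { "net45", new[] { "System.Object", "System.String", "System.Threading.Tasks.Task" } },
        { "wpa81", new[] { "System.Object", "System.String", "System.Threading.Tasks.Task" } },
        { "sl5",   new[] { "System.Object", "System.String" } },
    };

    // A PCL may only use what every selected platform offers: the intersection.
    public static HashSet<string> CommonApis(params string[] platforms)
    {
        if (platforms.Length == 0) return new HashSet<string>();

        var common = new HashSet<string>(Apis[platforms[0]]);
        foreach (var platform in platforms.Skip(1))
        {
            common.IntersectWith(Apis[platform]);
        }
        return common;
    }
}
```

With these made-up lists, `PortableSurface.CommonApis("net45", "wpa81")` still contains the Task type, but adding `"sl5"` to the selection removes it for everyone, which is exactly the situation the cat library is in.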

Note: In the spirit of writing down every method to achieve "portability", Microsoft recently launched support for what is known as "shared projects". Shared projects provide a simpler way to share sources between multiple projects, without having to manage and sync all of the project files.

The ways of the future

Now, before I get started describing the future: this is not yet set in stone. I had a lengthy discussion with @davidfowl the other day about this topic and what they're working on. As I've understood it (and don't go yelling at other people if it turns out I was wrong), most of the functionality for the next "portable thing" (and we'll get back to why I call it that later) is already specified, yet there are still some things left to decide. But I'll talk more about that when we get to how all this works. Here's what I'm certain about though: the way you do portability in the future will not be through a single dll that targets multiple platforms, but rather through multiple dlls, one for each platform.

Hold up a second, isn't that a step back? Well, no. Not really. Because, all in all, PCLs were made to solve the problem of having to maintain and configure a bunch of projects, and having to download and manage a bunch of different dependencies. Yet, in their very noble attempt to do just that, they created a whole new problem (actually more than one, but we'll focus on the main one). Suddenly, writing code that was different on the different platforms became impossible, or extremely messy. The way you achieve different results on different platforms is through a technique known by many as bait and switch. The problem with this bait and switch technique, though, is that you're right back where you started: with multiple projects. In fact, you've worsened the problem. Instead of having n projects, where n is the number of targeted platforms, you have n + 1 projects, where one of them is a portable project (and doesn't really do anything). However, this bait and switch tactic shows us an important thing. It's quite possible to create a "portable thing" that runs on multiple platforms and does platform-dependent things. We've just changed this "portable thing" from being an assembly to a nupkg (if you don't know what a nupkg is, go read my first blog post).

Note: multi platform libraries, or MPLs, is just a term I invented while writing this post. Others might have coined it before me, or it might not be used or understood by anybody. Whether you use it or not is entirely up to you, yet until something is handed to me from the great above (MS or otherwise) I'll continue to use it, as it provides a good distinction from PCLs.

That's great for dependency management, but what about creating "portable libraries", or as I will call them from here on out "multi platform libraries", or MPLs for short? If we're going to produce different assemblies for different platforms, that means we're right back to multiple projects, right? Well, yeah, sure, we could be, but what if we change our views a little, go back to the start, and set out to fix the problem of producing multiple assemblies from a single codebase? And this time, having learned from the great journey that PCLs were, what can we do to make it just as easy to produce MPLs? In fact, what we want is a single project that, with no configuration, can produce and consume MPLs.

I want to explain this in a bit more depth, so let's go back to the cat image example. We're still targeting the same three platforms, and Silverlight doesn't get the async variant of our method. Also, we're using another MPL to do most of the work, which is a Google (or Bing if that's your thing) image searcher and downloader named ImgSearch (fictitious too). So, we update our code to look like the following:

using System.Threading.Tasks;

namespace CatFinder
{
    public class Cat
    {
        public byte[] GetCatImage()
        {
            return ImgSearch.FindImage("cat");
        }

        #if !SL
        public async Task<byte[]> GetCatImageAsync()
        {
            // await is required: an async method returns the byte[], not the Task
            return await ImgSearch.FindImageAsync("cat");
        }
        #endif
    }
}

Now, just like our MPL, the ImgSearch MPL does not have its async method if we're running on Silverlight, so how would this even compile? Well, here's how it would work:

  1. Compilation of CatFinder is started.
  2. 3 target frameworks are found. net45, sl5 and wpa81.
  3. The compiler splits up into 3 compilations, one for each of the target platforms.
  4. First (note, the order here is of no importance, and is just exemplary) it kicks off the net45 compilation.
    1. net45 compilation starts.
    2. net45 compilation finds dependency MPL ImgSearch.
    3. Best match search (we'll get back to this later) of net45 for ImgSearch returns the net45 ImgSearch assembly.
    4. CatFinder for net45 is compiled against net45 with dependency ImgSearch for net45.
  5. Second, it compiles for sl5. This goes just as it did with net45, except the ImgSearch for sl5 is returned instead. This means that during the compilation of CatFinder for sl5, neither CatFinder, nor ImgSearch has any async methods. After it completes with the sl5 one, it continues on with wpa81, which goes just the same as the others.
  6. The three assemblies are packaged up into a nupkg, and you've successfully made an MPL with only a single project.
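The steps above can be sketched as a small planning function. This is not how any real build system is implemented; `MplBuild`, `Plan` and `BestMatch` are hypothetical names, and the best-match rule here is deliberately the simplest one possible (exact match only), since the real rule is discussed later in the post.

```csharp
using System;
using System.Collections.Generic;

public static class MplBuild
{
    // Hypothetical best-match rule for this sketch: exact match or nothing.
    public static string BestMatch(string target, ICollection<string> availableBuilds)
    {
        return availableBuilds.Contains(target) ? target : null;
    }

    // One compilation per target framework, each against the matching
    // build of the dependency, followed by a single packaging step.
    public static List<string> Plan(string project, string dependency,
                                    IEnumerable<string> targets,
                                    ICollection<string> dependencyBuilds)
    {
        var steps = new List<string>();
        foreach (var target in targets)
        {
            var match = BestMatch(target, dependencyBuilds);
            if (match == null)
            {
                throw new InvalidOperationException(
                    dependency + " has no build for " + target);
            }
            steps.Add("compile " + project + " for " + target +
                      " against " + dependency + " for " + match);
        }
        steps.Add("pack " + project + " into one nupkg");
        return steps;
    }
}
```

Calling `MplBuild.Plan("CatFinder", "ImgSearch", new[] { "net45", "sl5", "wpa81" }, new[] { "net45", "sl5", "wpa81" })` yields three compile steps and one pack step, mirroring the list above.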

vNext

But wait a second. This looks an awful lot like what the new project system used for vNext does, right? And you're right. The new project system shipped with vNext does a lot of the things described above, with the exception that it does not support Silverlight or Windows Phone, or a lot of the other platforms you might want to make MPLs for. This is one of the reasons why all of you should go to Visual Studio's UserVoice and vote for enabling the project.json project format for all kinds of project types. But just because this is handled by the new project system today doesn't mean it has to be. For instance, the project system used in vNext used to generate .csproj projects for each and every project it compiled, so it clearly can be done using MSBuild. What's needed, though, to get this working using normal Visual Studio project files, and compiling against targets such as Xamarin and whatnot, is some new tooling. But still, I'd rather take the project.json step for all things than MPL support in .csproj (that is, support for making, not consuming, as anything that supports nupkgs can consume MPLs, since they too are just nupkgs), but that might just be me.

So, how does it all work?

While we've briefly discussed how MPLs work, to get a better understanding of some of the problems we're facing, and how they might be solved, we first need to take a look at how PCLs work (I'm skipping the multiple project variant here, as that's trivial). So, back again to our cat finder sample. We're just starting the project, selecting our target frameworks, and hitting the OK button in Visual Studio. What happens under the hood is that VS has a list of assemblies supported by each of the platforms. For instance, all three of them support System, yet Silverlight does not support whatever assembly(-ies) provide the async functionality. A list is generated of all the assemblies that are supported by all the selected target frameworks, and all of those are used to compile against.

Let me make that clear once more. If VS discovers 150 assemblies that all of the selected platforms support, it will reference all 150 of them during the compilation. This doesn't really matter after the compilation is done, because the ones not used in our code don't get referenced in the resulting assembly, but it's still an interesting fact to know.

But first, to understand how PCLs work, we have to take a look at how assembly resolution and loading work. If we go back to our CatFinder example, and start a program based on it, CatFinder isn't going to be loaded until a method from CatFinder is invoked, or a type is referenced. That is, if I made a program that has a button that downloads a cat image, CatFinder isn't going to be loaded until the user clicks on that button. In other words, it is lazily loaded. Next, when it actually does load CatFinder, it will load something like CatFinder, Culture=, PublicKeyToken=, Version=1.0.0.0. Assembly resolution is done in this order:

  1. If we're on desktop, and the assembly is strongly named, check the GAC (global assembly cache). If it exists in the GAC, use that one.
  2. If it did not exist in the GAC, check the application base directory. This does matching against paths such as AppBase\CatFinder.dll and AppBase\CatFinder\CatFinder.dll (and some more).
  3. Lastly, if none of these are found, the AppDomain.AssemblyResolve event is raised, where you have a last chance to find the assembly yourself in code. If you don't, the load fails with an exception.
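As a rough illustration of that last step, a handler for `AppDomain.AssemblyResolve` might look something like the sketch below. The event, `ResolveEventArgs.Name`, `AssemblyName` and `Assembly.LoadFrom` are real framework APIs; the `LastChanceResolver` class and the "libs" sub-folder it probes are assumptions made up for this example.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class LastChanceResolver
{
    // Assumption for this sketch: extra assemblies live in a "libs" folder
    // under the application base directory.
    public static string ProbePath(string fullAssemblyName, string baseDirectory)
    {
        // "CatFinder, Version=1.0.0.0, ..." becomes "CatFinder"
        var simpleName = new AssemblyName(fullAssemblyName).Name;
        return Path.Combine(baseDirectory, "libs", simpleName + ".dll");
    }

    // Hooks the event that is raised when normal resolution has failed.
    public static void Register()
    {
        AppDomain.CurrentDomain.AssemblyResolve += (sender, args) =>
        {
            var candidate = ProbePath(args.Name, AppDomain.CurrentDomain.BaseDirectory);
            // Returning null means "I could not help"; the load then fails.
            return File.Exists(candidate) ? Assembly.LoadFrom(candidate) : null;
        };
    }
}
```

Calling `LastChanceResolver.Register()` once at startup is enough; the handler only runs for assemblies the runtime could not find on its own.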

Now, say we created a Windows Phone app and a normal Windows app that both had a button that went and downloaded a cat image. These apps are not PCLs; there's one for phone, and one for desktop. Both of them reference the CatFinder PCL. The CatFinder PCL, as we know, depends on the ImgSearch library (which we'll ignore for now), and since it uses strings and objects etc., it also depends on System.Runtime (and some other framework libraries). The CatFinder library is the only PCL referenced from both of our applications, and both of them only reference the minimal subset of framework references needed to work (typically mscorlib and similar).

On Windows Phone, when we click the button and CatFinder is loaded, the sequence of actions taken looks something akin to this (albeit simplified, and several libraries like System.Threading.Tasks are ignored):

  1. Request for CatFinder is started.
  2. CatFinder is found in the application directory.
  3. The Cat class is used, which is a subclass of System.Object defined in System.Runtime.
  4. System.Runtime (which is probably already loaded, but we're ignoring that) is requested.
  5. System.Runtime is resolved in the framework assemblies.
  6. System.Object is retrieved from System.Runtime.
  7. The Cat class is ready for use.

Note: It's quite possible that Windows Phone uses a facade assembly for System.Runtime, just as desktop does, and that the example above is a bad one. It still describes how this would work for an assembly that's not a facade, where the reference in the PCL actually mirrors the implementation in the framework.

On the desktop, things can be a bit more problematic, because the frameworks on the desktop aren't really split into nice packages like System.Runtime. Those came afterwards. Therefore, to get this working on desktop we need a facade assembly. A facade assembly is just what it sounds like: it's an assembly "pretending" to be an API surface, yet in truth it just delegates the implementations elsewhere. The System.Runtime facade assembly on desktop is just a bunch of type forwards into mscorlib. On the desktop, the same process might look something like this:

  1. Request for CatFinder is started.
  2. CatFinder is found in the application directory.
  3. The Cat class is used, which is a subclass of System.Object defined in System.Runtime.
  4. System.Runtime is requested.
  5. System.Runtime is resolved in the GAC.
  6. System.Object is registered as a typeforward in System.Runtime to mscorlib.
  7. mscorlib is requested (this one is definitely already loaded).
  8. mscorlib is returned.
  9. System.Object is retrieved from mscorlib.
  10. The Cat class is ready for use.

The important thing to grasp from this description of loading is the fact that it happens dynamically at runtime. Even though we used the same CatFinder assembly in both examples, the resulting System.Runtime assembly was different. Heck, the base class of Cat even came from different assemblies! This is quite important with regards to how PCLs work. The point to take away here is that there isn't really anything magic about the PCLs themselves that makes them "portable", rather they just use (or abuse) the fact that resolution of framework assemblies happens at runtime. So if we know that the runtime has support for async, we can include it in our assembly, and it will "Just Work".
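A quick way to see this for yourself is to ask the runtime where System.Object actually lives. The sketch below uses only standard reflection calls; the `WhereIsObject` class name is made up for the example.

```csharp
using System;

public static class WhereIsObject
{
    // Returns the simple name of the assembly that actually defines
    // System.Object on whatever runtime is executing this code.
    public static string DefiningAssembly()
    {
        return typeof(object).Assembly.GetName().Name;
    }
}
```

On the desktop .NET Framework, `WhereIsObject.DefiningAssembly()` reports mscorlib; other runtimes may report a different assembly. The compiled code is identical in both cases, only the answer found at runtime changes.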

So, we've talked about how PCLs work at compile-time and at run-time. However, there's one more "time" that's important to the life-cycle of PCLs that most people don't think about, and that is install-time. If you have a PCL as a single dll, or a project reference, then "install" is simply adding that to your csproj, and that's it. However, what makes the "Bait and Switch" work is the fact that framework resolution happens at install time when using "portable nupkgs". If the nupkg only contains a portable assembly, then that's used, plain and simple. However, if there is an assembly targeting our specific framework, then that's used instead. This is precisely what the "Bait and Switch" relies on. But if you have a nupkg with different assemblies for different targets, isn't that basically an MPL?

As it turns out, if you take a bait and switch nupkg and remove the portable assembly from it, you have an MPL, more or less, so as you can see, they are quite alike. But if so, what's the benefit of using MPLs? At runtime, as far as I've understood it, there isn't really one. Both MPLs and PCLs will do the same thing, because in the end they are simply IL instructions and metadata: types and methods that work with other types and methods, and do whatever you programmed them to do. As I've already said, there's not really anything magic about a PCL that makes it "portable". The difference is at compile-time (or design-time) and install-time. We've already looked at most of the compile-time differences, like the fact that MPLs enable us to alter what code is used for different frameworks, reflecting the fact that some frameworks have capabilities that others don't. This is without having to create a bunch of different projects, and while maintaining easy dependency management. In other words, it's just about tooling.

However, there is one large problem with PCLs and the bait and switch mechanism that makes it fundamentally flawed. Say we are working on a new project that uses our CatFinder, namely StripedCatFinder. StripedCatFinder, much like CatFinder, provides a single function that goes online and finds an image of a striped cat. This is done by invoking the CatFinder function, and doing some image magic to analyze whether or not the cat we found is in fact striped. If it's not, it just loops and attempts again.

As you may recall, CatFinder uses another library that we got from NuGet, named ImgSearch. ImgSearch is a bait and switch nupkg, where the portable dll is just empty implementations. StripedCatFinder, in contrast to CatFinder, is a net45-only project, so it's not a PCL. So we have StripedCatFinder, which is a net45 project that depends on CatFinder, which is a regular PCL project, which in turn depends on ImgSearch, which is a bait and switch nupkg. Now, if we compile StripedCatFinder, it's going to initiate a compilation of CatFinder, which is going to compile CatFinder using the PCL assembly from ImgSearch. Then StripedCatFinder is going to be compiled against those two assemblies. Now here comes the problem. The PCL assembly of ImgSearch does not, as said, contain any actual implementations. Normally this isn't a problem, since the assembly is never used. However, in the case where you have a PCL project that is being referenced by a non-PCL project, this whole thing falls apart. Because, while StripedCatFinder should have the working net45 version of ImgSearch, it instead gets the non-working PCL version, which will blow up at run time.
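To show why the compiler cannot catch this, here is a minimal sketch of what a "bait" build might look like. Everything in it is hypothetical: ImgSearch is fictitious to begin with, and throwing NotImplementedException is just one plausible way to write an "empty" implementation.

```csharp
using System;

// Hypothetical "bait" build of ImgSearch: the public surface exists so that
// PCL projects can compile against it, but nothing is implemented.
public static class ImgSearchBait
{
    public static byte[] FindImage(string query)
    {
        throw new NotImplementedException(
            "Bait assembly: a platform-specific ImgSearch should have been deployed instead.");
    }
}

public static class StripedCatFinderDemo
{
    // Compiles without complaint against the bait; the failure only
    // shows up once the code actually runs.
    public static bool TryFindCat()
    {
        try
        {
            ImgSearchBait.FindImage("cat");
            return true;
        }
        catch (NotImplementedException)
        {
            return false;
        }
    }
}
```

If the bait ends up deployed with the net45 application, `StripedCatFinderDemo.TryFindCat()` returns false even though every project built successfully.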

[Image: borked bait and switch]

Note: I've been told that work is being done on NuGet and MSBuild to mitigate this problem, but at the time of writing, this is a serious problem that can cause huge errors that the compiler does not pick up.

As MPLs are implemented by ASP.NET vNext today, they do not have this problem. However, the reason they do not have this problem is in itself another bug. vNext does dependency graph walking at runtime to do runtime compilation of the projects, so it will get the net45 version of ImgSearch, however, it will also (wrongly) compile CatFinder against the net45 version of ImgSearch, instead of the PCL one. This means that even though CatFinder is a PCL, you get "different" versions of it depending on which project or framework is requesting it, which can cause bugs and differences that should not really be there. This isn't really a problem though, as we transition to making MPLs instead of PCLs. Because if CatFinder here was an MPL instead of a PCL we wouldn't have the problem.

Install-time wise, MPLs currently behave exactly like a bait and switch nupkg, because when you use a bait and switch nupkg, the portable assembly is never really used. There is however one thing about MPLs that isn't addressed yet, which is also in the end just about tooling, and it's about how NuGet handles framework resolution during installation of a nupkg.

Installing nupkgs

Whenever you install a nupkg into any project, NuGet runs an algorithm to determine just which assemblies need to be referenced by your project. This algorithm handles things like: if you're on net45 and there is only a net40 assembly, it's selected, and if there's both a portable and a net45 assembly, it picks the net45 one. Today that algorithm works by having a list of all the possible frameworks, like net45, wpa, xamarin.android and so on, and it knows which of these work with each other. Likewise, it knows that portable-net45+wpa works on both net45 and wpa.
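A minimal sketch of such a hard-coded lookup, using only the examples from the paragraph above, could look like this. The `FrameworkPicker` class and its tiny table are made up for illustration; the real NuGet tables and matching rules are far more extensive.

```csharp
using System;
using System.Collections.Generic;

public static class FrameworkPicker
{
    // Hypothetical, hard-coded fallback order per project framework,
    // most specific first.
    static readonly Dictionary<string, string[]> Fallbacks = new Dictionary<string, string[]>
    {
        { "net45", new[] { "net45", "net40", "portable-net45+wpa" } },
        { "wpa",   new[] { "wpa", "portable-net45+wpa" } },
    };

    // Returns the lib folder to reference, or null when nothing is compatible
    // (including when the project framework is unknown to the table).
    public static string Pick(string projectFramework, ICollection<string> libFolders)
    {
        string[] order;
        if (!Fallbacks.TryGetValue(projectFramework, out order)) return null;

        foreach (var candidate in order)
        {
            if (libFolders.Contains(candidate)) return candidate;
        }
        return null;
    }
}
```

Note what happens with a framework the table has never heard of: `Pick` simply returns null, no matter what the package contains. That is the weakness of a hard-coded list.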

Note: When you install nupkgs into K projects, nothing really happens. K does dependency walking at runtime, meaning the framework selection I'm describing here happens at run-time, not install-time. So, when working with K, there isn't really much of an install-time.

However, a problem arises whenever a new framework is added, like k10, which is the new Cloud Optimized (or Core) CLR. k10 is basically a superset of netcore45. Currently, in vNext, this is hard-coded, so that if you try to use a nupkg which has netcore45 and no k10, it'll use the netcore45 dll. However, having to hard-code this is inconvenient, because that means changes have to be made to NuGet (and the runtime loader). This is due to the fact that NuGet currently has no way to know what's compatible, with the exception of the hard-coded list it has.

What is currently being discussed though, is a way to remedy this. Instead of just "knowing" that k10 is compatible with netcore45, wouldn't it be better if it was possible to test if CatFinder for sl5 would work on k10? This brings us to one of the more advanced subjects with regards to PCLs and MPLs, namely "contract assemblies".

Contract Assemblies

Contract assemblies are basically like the empty PCL assemblies found in bait and switch nupkgs, with the exception that instead of describing the capabilities of a project, they are used to describe the capabilities of a framework. What's done is basically the same thing that was done when PCLs first started out: take a good, long, hard look at the .NET Framework, and figure out logical "parts" which we can split it up into, so that we can say that Windows Phone supports crypto and http, whereas Silverlight does not (as an example, this probably does not hold true). However, with the contract assemblies they've taken this a step further, and increased the granularity of the packages. For instance, if we take a look at the System.Security packages that exist as part of the Core CLR today (note that these are subject to change, and probably will do so, until we reach a stable version) we have the following:

  • System.Security.Cryptography.DeriveBytes
  • System.Security.Cryptography.Encoding
  • System.Security.Cryptography.Encryption
  • System.Security.Cryptography.Encryption.Aes
  • System.Security.Cryptography.Encryption.RSA
  • System.Security.Cryptography.Hashing
  • System.Security.Cryptography.Hashing.Algorithms
  • System.Security.Cryptography.RandomNumberGenerator
  • System.Security.Cryptography.X509Certificates
  • System.Security.Principal
  • System.Security.SecureString

Splitting up the .NET Framework into these small packages allows us to describe with finer granularity which frameworks a package will work on. For instance, if you were to write a package that requires System.Security.Cryptography.Encryption, it should be able to run on any CLR that supports that. The idea here is that instead of saying that our package works on net45 and Windows Phone, we say that our package works on anything that has support for System.Security.Cryptography.Encryption. However, remember that currently this is just a fallback for the other algorithm used to find fitting packages.
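Expressed as code, the contract-based check boils down to a subset test. The sketch below assumes that both the package's requirements and the framework's capabilities can be reduced to plain lists of contract names; `ContractCheck` is a hypothetical helper, not an existing NuGet API, and version matching is left out entirely.

```csharp
using System;
using System.Collections.Generic;

public static class ContractCheck
{
    // A package can run on a framework if every contract it needs
    // is among the contracts the framework provides.
    public static bool CanRunOn(IEnumerable<string> requiredContracts,
                                IEnumerable<string> providedContracts)
    {
        var provided = new HashSet<string>(providedContracts);
        return new HashSet<string>(requiredContracts).IsSubsetOf(provided);
    }
}
```

With this kind of check, a brand new framework needs no entry in any table. It only has to declare which contracts it supports, and every existing package whose requirements are covered works on it automatically.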

Currently, two ways have been proposed to add support for contract assemblies. One is in use with K today, which is putting these contract assemblies into nupkgs and publishing them on a NuGet feed. If you were to download, say, System.Console and look at it, you would see that it contains a lib/contracts folder, which contains an assembly with empty methods. This is a contract assembly, used to compile against. Later it will also be used to state that, in order to use a given library, the framework you are running on must support System.Console.

[Image: contract assembly]

The advantage of this way is that it works without changing NuGet. Of course, the fallback algorithm that determines which dll to use based on contract assemblies would still have to be added to NuGet, but these contract assemblies are already used for compiling against the Core CLR today, and they are doing so without having to change NuGet. The drawback is that you end up with a bunch of nupkgs.

The other way is to add a new metadata field to NuGet for the contract assemblies, which would work a lot like the frameworkReference field works today. If you haven't noticed, NuGet has support for adding frameworkReferences to a .nuspec, which lists the framework assemblies that should be referenced by the project using the nupkg. Contract assemblies could work just the same way, with the exception that they would have a version included. The advantage of having the contract assemblies as a custom field in the nupkg would be that they wouldn't be downloaded as dependencies, so you'd get faster restore of packages.

Wrapping up

This has been an enormous blog post (just about reaching 5000 words, and it probably will have by the time I'm done). I've gone through some highly technical details about how PCLs work, and how they should work (MPLs). If there's anything I've left out that you would like me to explain, or something that you've not understood, please send me an email at alxandr alxandr me, or create an issue over at GitHub. And if you see anything that's wrong, please do contact me. Thanks for reading.
