Alxandr.me

/* TODO: Implement */


vNext with a touch of functional

Previously I explained how vNext makes it possible to create custom "loaders" that enable support for different languages (among other things). Today I'm going to talk about what went into the creation of FSharpSupport, how to use it, and some of the speed bumps along the way.

But first of all, I need to talk a bit about how vNext has changed: you no longer create loaders, but project reference providers. That's a bit hard to swallow, but hopefully it'll make sense in a little while.

The problems with project loaders

Project loaders, as they were, had one major flaw: they only worked at runtime. There was no (simple or built-in) way to take a project loader and have it generate a MPL for you. This meant that you only had source distribution. This was a well-known problem, and something everyone knew had to be resolved at some point, yet how was not clear at the time. Over the last few weeks Fowler has had more than a few epiphanies about how best to do this, and there's been a lot of refactoring of how projects are compiled, both at runtime and at compile time. This has led to more than just a few explosions along the way. Also, due to the volatile nature of vNext, this might very well have changed again by the time you read this blog post. Heck, it might even change by the time I'm done writing it.

Project references

Today in vNext, this works like the following:
1. A project consists of sources, configuration, dependencies and exports.
2. All dependencies given to a compilation are just metadata references.

References can be one of the following:
1. IMetadataFileReference: Reference to a single .dll. This is mainly used to reference assemblies found in nupkgs.
2. IMetadataEmbeddedReference: References an embedded .dll. This is mainly used for Assembly Neutral Interfaces.
3. IMetadataProjectReference: References a project. This is the important interface we will have to provide to get new language support.

The IMetadataProjectReference interface looks like this:

namespace Microsoft.Framework.Runtime  
{
    [AssemblyNeutral]
    public interface IMetadataReference
    {
        string Name { get; }
    }

    [AssemblyNeutral]
    public interface IMetadataProjectReference : IMetadataReference
    {
        string ProjectPath { get; }

        IDiagnosticResult GetDiagnostics();

        IList<ISourceReference> GetSources();

        Assembly Load(IAssemblyLoaderEngine loaderEngine);

        void EmitReferenceAssembly(Stream stream);

        IDiagnosticResult EmitAssembly(string outputPath);
    }
}

The Name and ProjectPath properties are straightforward. They get the name and project path of the project. GetSources is also fairly easy. It returns a list of all the sources used in the compilation. This is used to do things like generate symbol packages and show the list of sources in Visual Studio.

GetDiagnostics is used by the Design Time Host (which VS, among other things, uses to provide diagnostics and information about the projects). It returns diagnostic information like warnings and errors to the caller. EmitAssembly does mostly the same as GetDiagnostics, except that it also emits an assembly to the specified location. The location provided is a directory, so it's possible to output multiple files (like MyProject.dll, MyProject.xml and MyProject.pdb). The Load method is used to load assemblies at runtime. This can (and normally does) include compilation.

Lastly, there's EmitReferenceAssembly. A reference assembly is an assembly that only has metadata. It has the types and methods, but not their implementations. This is not necessarily the case for the output of EmitReferenceAssembly though: some compilers might not support outputting reference assemblies, so they just output regular assemblies, which works just fine. EmitReferenceAssembly is called when your project is a dependency of another project, and that other project is about to be compiled. It then needs the metadata (types, interfaces, etc.) from our project to be able to compile against it.

Project reference provider

To kick-start the whole thing another interface is needed, namely IProjectReferenceProvider. It's responsible for providing project references on demand. This is the type we inject into the vNext pipeline in order for it to support our languages.

Building support for F#

As the project reference provider is the first type we need, and the simplest, let's start with that. The IProjectReferenceProvider has just a single method: GetProjectReference. GetProjectReference takes a few input parameters with information about the project currently being built, and returns an IMetadataProjectReference. The interface looks like this:

namespace Microsoft.Framework.Runtime  
{
    public interface IProjectReferenceProvider
    {
        IMetadataProjectReference GetProjectReference(
            Project project, 
            FrameworkName targetFramework, 
            string configuration, 
            IEnumerable<IMetadataReference> incomingReferences,
            IEnumerable<ISourceReference> incomingSourceReferences,
            IList<IMetadataReference> outgoingReferences);
    }
}

The project is basically a representation of the project.json file. The targetFramework is which framework we're targeting (like net45 or k10). configuration is the current configuration (like debug or release). incomingReferences are metadata dependencies (libraries on disk or in memory) and incomingSourceReferences are shared sources. Lastly, outgoingReferences is used to provide your own ANIs. To get started with F# support, an implementation of the IProjectReferenceProvider must be provided. Stealing some code from Fowler, we end up with this:

using System.Collections.Generic;  
using System.Runtime.Versioning;  
using Microsoft.Framework.Runtime;

namespace FSharpSupport  
{
    public class FSharpProjectReferenceProvider : IProjectReferenceProvider
    {
        public IMetadataProjectReference GetProjectReference(
            Project project, 
            FrameworkName targetFramework, 
            string configuration, 
            IEnumerable<IMetadataReference> incomingReferences, 
            IEnumerable<ISourceReference> incomingSourceReferences, 
            IList<IMetadataReference> outgoingReferences)
        {
            // Represents the project reference
            return new FSharpProjectReference(project, targetFramework, configuration, incomingReferences);
        }
    }
}

This implementation is really simple. It takes the parts we need from the project info we get, and hands them off to an FSharpProjectReference. With not much else going on here, let's dive straight into that:

private readonly Project _project;  
private readonly FrameworkName _targetFramework;  
private readonly string _configuration;  
private readonly IEnumerable<IMetadataReference> _metadataReferences;

public FSharpProjectReference(Project project,  
                              FrameworkName targetFramework,
                              string configuration,
                              IEnumerable<IMetadataReference> metadataReferences)
{
    _project = project;
    _targetFramework = targetFramework;
    _configuration = configuration;
    _metadataReferences = metadataReferences;
}

public string Name  
{
    get { return _project.Name; }
}

public string ProjectPath  
{
    get { return _project.ProjectFilePath; }
}

public IList<ISourceReference> GetSources()  
{
    return _project.SourceFiles
                   .Select(p => (ISourceReference)new SourceFileReference(p))
                   .ToList();
}

I've already added the trivial boilerplate code. Next up, we need to start implementing some of the more advanced IMetadataProjectReference members. However, as discussed in the previous section, a lot of the methods do more or less the same thing, especially when working with an external compiler like fsc, which does not take in-memory references and does not spit out in-memory assemblies. So the first thing we'll do is write a general-purpose Emit method.

Emitting assemblies

The general purpose Emit method has the following signature:

public IDiagnosticResult Emit(string outputPath, bool emitPdb, bool emitDocFile, bool emitExe = false)  

It takes an output path, whether or not to emit a pdb-file and a doc-file, and if the resulting library should be an executable or a dll. It uses these to construct command-line arguments to use with fsc (through the use of FSharp.Compiler.Service).

The first thing needed to emit the assembly is to figure out where to put temp files, and where to put the resulting library file. Therefore, the first few lines of the Emit method are:

var tempBasePath = Path.Combine(outputPath, _project.Name, "obj");  
var outputDll = Path.Combine(outputPath, _project.Name + (emitExe ? ".exe" : ".dll"));  
Directory.CreateDirectory(tempBasePath);  

Next up, the default command line arguments are added together in a list:

var fscArgBuilder = new List<string>();  
fscArgBuilder.Add("fsc.exe");  
fscArgBuilder.Add("--noframework");  
fscArgBuilder.Add("--nologo");  
fscArgBuilder.Add("--out:" + outputDll);  
fscArgBuilder.Add("--target:" + (emitExe ? "exe" : "library"));  

And handle the cases with pdbs and xml-docs:

if (emitPdb)  
{
    var pdb = Path.Combine(outputPath, _project.Name + ".pdb");

    fscArgBuilder.Add("--debug");
    fscArgBuilder.Add("--optimize-");
    fscArgBuilder.Add("--tailcalls-");
    fscArgBuilder.Add("--pdb:" + pdb);
}

if (emitDocFile)  
{
    var doc = Path.Combine(outputPath, _project.Name + ".xml");

    fscArgBuilder.Add("--doc:" + doc);
}

After that's done, we need to add all of the source files. That's easily done with the following statement:

fscArgBuilder.AddRange(_project.SourceFiles);  
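To make the argument list easier to inspect, the fragments so far can be gathered into one function. The following is only a sketch: the FscArgs class and its Build method are hypothetical names that are not part of FSharpSupport, and the project name and source files are passed in as plain values instead of being read from the Project object.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of FSharpSupport: gathers the argument-building
// fragments above into one function so the resulting list can be inspected.
public static class FscArgs
{
    public static List<string> Build(
        string outputPath,
        string projectName,
        IEnumerable<string> sourceFiles,
        bool emitPdb,
        bool emitDocFile,
        bool emitExe = false)
    {
        var outputFile = Path.Combine(outputPath, projectName + (emitExe ? ".exe" : ".dll"));

        var args = new List<string>
        {
            "fsc.exe",
            "--noframework",
            "--nologo",
            "--out:" + outputFile,
            "--target:" + (emitExe ? "exe" : "library")
        };

        if (emitPdb)
        {
            // Debug build: symbols on, optimizations and tail calls off.
            args.Add("--debug");
            args.Add("--optimize-");
            args.Add("--tailcalls-");
            args.Add("--pdb:" + Path.Combine(outputPath, projectName + ".pdb"));
        }

        if (emitDocFile)
        {
            args.Add("--doc:" + Path.Combine(outputPath, projectName + ".xml"));
        }

        // Source files are appended in the order given, since file order matters in F#.
        args.AddRange(sourceFiles);
        return args;
    }
}
```

Calling FscArgs.Build("out", "MyProject", new[] { "file1.fs", "file2.fs" }, emitPdb: true, emitDocFile: false) would yield the default flags, the debug flags, and then the two source files in that order.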

Then come the references. References are more complicated because, as mentioned earlier, there are three different kinds we need to account for. The ones that come from nupkgs are easy, because they are already assemblies on disk (and we just get a path to them). However, embedded references and project references have to be written to disk. We also want to make sure that the FSharpSupport assembly itself isn't added to the compilation, as it's not intended to be used as an API. This will probably not be necessary once vNext gets support for development dependencies, but as of today it still needs to be handled separately. Also, the files we put on disk for the compilation should be cleaned up afterwards, so that we don't keep producing assemblies that just lie around forever.

So, to get started, we need to loop through all of the given references and handle them one by one, and if it's the FSharpSupport itself, we just skip it:

var tempFiles = new List<string>();

// These are the metadata references being used by your project.
// Everything in your project.json is resolved and normalized here:
// - Project references
// - Package references are turned into the appropriate assemblies
// - Assembly neutral references
// Each IMetadataReference maps to an assembly
foreach (var reference in _metadataReferences)  
{
    // Skip this project
    if (reference.Name == typeof(FSharpProjectReference).Assembly.GetName().Name)
    {
        continue;
    }

First off are the nupkg assemblies. As mentioned, those are easy, and we simply need to add them to our compilation:

// NuGet references
var fileReference = reference as IMetadataFileReference;  
if (fileReference != null)  
{
    fscArgBuilder.Add(@"-r:" + fileReference.Path);
}

Next up is assembly neutral interfaces (or embedded assemblies in general). They are delivered as a byte array, and have to be written to disk prior to compiling. They also need to be included in the temp-files, so they get removed after the compilation:

// Assembly neutral references
var embeddedReference = reference as IMetadataEmbeddedReference;  
if (embeddedReference != null)  
{
    var tempEmbeddedPath = Path.Combine(tempBasePath, reference.Name + ".dll");

    // Write the ANI to disk for fsc
    File.WriteAllBytes(tempEmbeddedPath, embeddedReference.Contents);

    fscArgBuilder.Add("-r:" + tempEmbeddedPath);

    tempFiles.Add(tempEmbeddedPath);
}

And lastly, there are the project references. Project references are handled just like the embedded assemblies, except that they are written to disk differently, by means of the EmitReferenceAssembly method:

// Project references
var projectReference = reference as IMetadataProjectReference;  
if (projectReference != null)  
{
    // You can write the reference assembly to the stream
    // and add the reference to your compiler

    var tempProjectDll = Path.Combine(tempBasePath, reference.Name + ".dll");

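    // File.Create truncates an existing file; File.OpenWrite would leave
    // stale trailing bytes if a longer file was left over from a failed build
    using (var fs = File.Create(tempProjectDll))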
    {
        projectReference.EmitReferenceAssembly(fs);
    }

    fscArgBuilder.Add(@"-r:" + tempProjectDll);

    tempFiles.Add(tempProjectDll);
}
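Since the snippets above show the body of the loop one piece at a time, here is a sketch of how the pieces might fit together as one complete loop. It is only an illustration: the interfaces are stand-ins reduced to the members used here, the ReferenceArgs helper and the FileReference and EmbeddedReference classes are hypothetical, and the name to skip is passed in as a string instead of being read from the assembly.

```csharp
using System.Collections.Generic;
using System.IO;

// Stand-ins for the vNext interfaces, reduced to the members used below.
public interface IMetadataReference { string Name { get; } }
public interface IMetadataFileReference : IMetadataReference { string Path { get; } }
public interface IMetadataEmbeddedReference : IMetadataReference { byte[] Contents { get; } }
public interface IMetadataProjectReference : IMetadataReference
{
    void EmitReferenceAssembly(Stream stream);
}

// Hypothetical minimal implementations, only here to make the sketch usable.
public class FileReference : IMetadataFileReference
{
    public FileReference(string name, string path) { Name = name; Path = path; }
    public string Name { get; private set; }
    public string Path { get; private set; }
}

public class EmbeddedReference : IMetadataEmbeddedReference
{
    public EmbeddedReference(string name, byte[] contents) { Name = name; Contents = contents; }
    public string Name { get; private set; }
    public byte[] Contents { get; private set; }
}

public static class ReferenceArgs
{
    // Hypothetical helper: turns references into "-r:" arguments and records
    // every file it had to write, so the caller can delete them afterwards.
    // Assumes tempBasePath already exists.
    public static void Add(
        IEnumerable<IMetadataReference> references,
        string selfName,
        string tempBasePath,
        List<string> fscArgs,
        List<string> tempFiles)
    {
        foreach (var reference in references)
        {
            // Skip the language support assembly itself.
            if (reference.Name == selfName) continue;

            // Assemblies already on disk: just pass the path along.
            var file = reference as IMetadataFileReference;
            if (file != null)
            {
                fscArgs.Add("-r:" + file.Path);
                continue;
            }

            var tempPath = System.IO.Path.Combine(tempBasePath, reference.Name + ".dll");

            // Embedded assemblies: write the bytes to disk for fsc.
            var embedded = reference as IMetadataEmbeddedReference;
            if (embedded != null)
            {
                File.WriteAllBytes(tempPath, embedded.Contents);
            }
            else
            {
                // Project references: let the project emit itself to disk.
                var project = reference as IMetadataProjectReference;
                if (project == null) continue; // unknown kind, nothing to add
                using (var fs = File.Create(tempPath))
                {
                    project.EmitReferenceAssembly(fs);
                }
            }

            fscArgs.Add("-r:" + tempPath);
            tempFiles.Add(tempPath);
        }
    }
}
```

Only the files the helper wrote itself end up in tempFiles, so assemblies that were already on disk are never deleted during cleanup.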

With that done, all that is left is to run the actual compilation. This is done with types from the FSharp.Compiler.Service nupkg, like so:

var scs = new SimpleSourceCodeServices();  
var result = scs.Compile(fscArgBuilder.ToArray());  

The last piece of the puzzle is to filter out warnings and errors, and produce the diagnostics result. To simplify debugging, we only delete the temp files if the compilation succeeded. Add the following few lines, and the method is done:

var warnings = result.Item1.Where(i => i.Severity.IsWarning).Select(i => i.ToString()).ToArray();  
var errors = result.Item1.Where(i => i.Severity.IsError).Select(i => i.ToString()).ToArray();

if (result.Item2 != 0)  
{
    return new DiagnosticResult(success: false, warnings: warnings, errors: errors);
}

// Nuke the temporary references on disk
tempFiles.ForEach(File.Delete);

Directory.Delete(tempBasePath);

return new DiagnosticResult(success: true, warnings: warnings, errors: errors);  

And there we have it. A complete and working Emit method that can compile our F# sources and give back errors and warnings. Now we just need to implement the IMetadataProjectReference members that use the Emit method, which is now significantly easier.

First there's Load. It emits the assembly to a temp location, then loads it.

public Assembly Load(IAssemblyLoaderEngine loaderEngine)  
{
    string outputDir = Path.Combine(Path.GetTempPath(), "dynamic-assemblies");

    var result = Emit(outputDir, emitPdb: true, emitDocFile: false);

    if (!result.Success)
    {
        throw new CompilationException(result.Errors.ToList());
    }

    var assemblyPath = Path.Combine(outputDir, _project.Name + ".dll");

    return loaderEngine.LoadFile(assemblyPath);
}

EmitReferenceAssembly does much the same as Load, but it puts the files in a randomly generated temp location, and cleans up after it's done.

public void EmitReferenceAssembly(Stream stream)  
{
    string outputDir = Path.Combine(Path.GetTempPath(), "reference-assembly-" + Guid.NewGuid().ToString());

    try
    {
        var result = Emit(outputDir, emitPdb: false, emitDocFile: false);

        if (!result.Success)
        {
            return;
        }

        using (var fs = File.OpenRead(Path.Combine(outputDir, _project.Name + ".dll")))
        {
            fs.CopyTo(stream);
        }
    }
    finally
    {
        Directory.Delete(outputDir, true);
    }
}

GetDiagnostics also puts the files in a temp location, but simply deletes them when it's done. It doesn't care about the files; it only wants the compiler output (warnings and errors):

public IDiagnosticResult GetDiagnostics()  
{
    string outputDir = Path.Combine(Path.GetTempPath(), "diagnostics-" + Guid.NewGuid().ToString());

    try
    {
        return Emit(outputDir, emitPdb: false, emitDocFile: false);
    }
    finally
    {
        Directory.Delete(outputDir, true);
    }
}
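EmitReferenceAssembly and GetDiagnostics share the same shape: make a uniquely named temp directory, do the work, and delete the directory no matter what. That pattern could be pulled out into a small helper. This is a sketch only, and TempDirectory.Use is a hypothetical name that is not part of FSharpSupport:

```csharp
using System;
using System.IO;

public static class TempDirectory
{
    // Hypothetical helper: runs `action` against a fresh, uniquely named
    // directory under the system temp path and removes it afterwards,
    // whether `action` returns normally or throws.
    public static T Use<T>(string prefix, Func<string, T> action)
    {
        var dir = Path.Combine(Path.GetTempPath(), prefix + "-" + Guid.NewGuid().ToString());
        Directory.CreateDirectory(dir);

        try
        {
            return action(dir);
        }
        finally
        {
            // Guarded so that a directory already removed by `action`
            // does not throw here and mask the original exception.
            if (Directory.Exists(dir))
            {
                Directory.Delete(dir, true);
            }
        }
    }
}
```

With a helper like this, the body of GetDiagnostics would shrink to a single call along the lines of TempDirectory.Use("diagnostics", dir => Emit(dir, emitPdb: false, emitDocFile: false)).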

And that's it. As of today, that's all you need to implement a new language in K. And while this might look like a daunting task, it's less so than one might think. However, for those asking for VB.NET support, wait a short while, because VB.NET support can be made quite awesome, but it requires an update to Roslyn and K which is coming down the line.

Usage

Using the new "language service", "project reference provider" or whatever you want to call it is quite easy. Create a new project, add FSharpSupport as a dependency, and add a "language" key to the project.json. You also need to list the sources explicitly, since the default just goes looking for *.cs. Listing them is also quite important in F#, since file order matters.

The end-result looks something like this:

{
    "dependencies": {
        "FSharpSupport": ""
    },
    "code": [ "file1.fs", "file2.fs" ],
    "language": {
        "name": "F#",
        "assembly": "FSharpSupport",
        "projectReferenceProviderType": "FSharpSupport.FSharpProjectReferenceProvider"
    },
    "frameworks": {
        "net45": { }
    }
}

Next time, I'll show how I was able to implement the F# support in F#. Also, if there are any questions, please leave me a comment, or send me an email.
