Nullable<T> vs null - fmork.net

One thing that I sometimes see people wonder about is how Nullable<T> relates to null. This blog post is an attempt to explain how it works behind the scenes.

To follow this text, you should have a basic understanding of the difference between reference types and value types in .NET, and of why value types cannot be null while reference types can. If you are unsure about this, you can read up on the topic here: Parameter passing in C# (by Jon Skeet). You should also have basic knowledge of generics (no advanced stuff; just enough to understand the syntax around them).

Declaring

In C# there are two ways to declare members using Nullable<T> (I am using int for the examples, but it could be any value type):

    private Nullable<int> a = null;
    private int? b = null;

The two code lines above both declare the exact same type: Nullable<int>. The second, shorter declaration is simply syntactic sugar that the C# compiler translates into Nullable<int>. There is no type difference of any kind between the two styles (the ultimate proof of this is that object.ReferenceEquals(typeof(Nullable<int>), typeof(int?)) returns true); it's just a matter of... well, style. In this blog post I will stick to the longer version of the two for the sake of clarity.
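To make the claim about the two spellings easy to check, here is a minimal sketch of a console program. The class name NullableSyntaxDemo and the helper SameType are made up for this illustration; the calls themselves (typeof, object.ReferenceEquals, Nullable.GetUnderlyingType) are standard .NET.

```csharp
using System;

// Hypothetical demo class; the name is not part of the original post.
public static class NullableSyntaxDemo
{
    // True when both spellings resolve to the very same System.Type object.
    public static bool SameType()
    {
        return object.ReferenceEquals(typeof(Nullable<int>), typeof(int?));
    }

    public static void Main()
    {
        Console.WriteLine(SameType());                                // prints True
        // The value type wrapped by int? is plain System.Int32.
        Console.WriteLine(Nullable.GetUnderlyingType(typeof(int?)));  // prints System.Int32
    }
}
```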

Not all nulls are equal

The type Nullable<T> is itself a value type. It is a struct, and exhibits the same behavior as other value types. One of the characteristics of a value type is its inability to be null. So, how is it that the above declarations function at all? Should the compiler not issue a compilation error? Why can we suddenly assign null to a variable that is of a value type?

In short: we can't. It's a compiler trick. Let's examine this a bit closer. I will first look at how a reference type behaves:

    private static string GetNullString()
    {
        return null;
    }

A simple method, returning a string, in which the body returns null. The IL code for the method looks like this:

    .method private hidebysig static string  GetNullString() cil managed
    {
      // Code size       2 (0x2)
      .maxstack  8
      IL_0000:  ldnull
      IL_0001:  ret
    } // end of method TestClass::GetNullString

What the above IL code does is push a null onto the stack (ldnull) and then return it to the caller (ret). Now, let's look at a similar function returning a Nullable<int>:

    private static Nullable<int> GetNullInt()
    {
        return null;
    }

...and the IL code:

    .method private hidebysig static valuetype [mscorlib]System.Nullable`1<int32>
            GetNullInt() cil managed
    {
      // Code size       10 (0xa)
      .maxstack  1
      .locals init ([0] valuetype [mscorlib]System.Nullable`1<int32> CS$0$0000)
      IL_0000:  ldloca.s   CS$0$0000
      IL_0002:  initobj    valuetype [mscorlib]System.Nullable`1<int32>
      IL_0008:  ldloc.0
      IL_0009:  ret
    } // end of method TestClass::GetNullInt

Notice how the method initializes a new instance of the type Nullable<int> and returns it. There is no trace of null at all. The instance is created using the initobj instruction, which initializes each field of the type to either a null reference (if the field is of a reference type) or the default value of the type (if the field is of a value type).

When your code assigns null to a Nullable<int> (or any other nullable value type), the compiler emits IL code that initializes a new instance of Nullable<int> and assigns that instead. So we do in fact get an instance even though we assign null. This can be verified by running the following code:

    Nullable<int> a = null;
    Console.WriteLine(a.HasValue);  // prints False

Here we assign null to a variable and then access a member through that variable. This kind of code would typically throw a NullReferenceException, but in the case of Nullable<int> it's perfectly legal.

What happens, then, if we assign a value that is not null?

    private static Nullable<int> GetNonNullInt()
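The following sketch extends that check a little, using only members that Nullable<T> really has (HasValue, GetValueOrDefault and Value). The class NullAssignedDemo and the helper ReadValue are hypothetical names chosen for the example. It shows that the instance is fully usable, and that the only member that objects is Value, which throws InvalidOperationException rather than NullReferenceException.

```csharp
using System;

// Hypothetical demo class; the name is not part of the original post.
public static class NullAssignedDemo
{
    // Tries to read Value and reports what happened as a string.
    public static string ReadValue(Nullable<int> n)
    {
        try
        {
            return "Value = " + n.Value;
        }
        catch (InvalidOperationException)
        {
            // Thrown by the Value getter when HasValue is false.
            return "InvalidOperationException";
        }
    }

    public static void Main()
    {
        Nullable<int> a = null;
        Console.WriteLine(a.HasValue);             // prints False
        Console.WriteLine(a.GetValueOrDefault());  // prints 0
        Console.WriteLine(ReadValue(a));           // prints InvalidOperationException
        Console.WriteLine(ReadValue(5));           // prints Value = 5
    }
}
```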
    {
        return 5;
    }

The generated IL code:

    .method private hidebysig static valuetype [mscorlib]System.Nullable`1<int32>
            GetNonNullInt() cil managed
    {
      // Code size       7 (0x7)
      .maxstack  8
      IL_0000:  ldc.i4.5
      IL_0001:  newobj     instance void valuetype [mscorlib]System.Nullable`1<int32>::.ctor(!0)
      IL_0006:  ret
    } // end of method TestClass::GetNonNullInt

In this method the Nullable<int> instance is created in a different manner: first the method pushes an integer of the value 5 onto the stack (ldc.i4.5), then newobj creates a new instance by invoking a constructor. The constructor takes one parameter of the type !0. That looks like a weird type, but it is simply IL notation for the first generic type argument of the declaring type, which is int32 here. The point is that in the first case, when we assign null, we get an instance of Nullable<int> whose fields are initialized to the default values for their respective types, while in the case where we assign an integer, the instance is initialized using a constructor to which the value is passed. This difference is one part of the magic of Nullable<T>.

Now we have established that when it comes to assigning null to a Nullable<int>, it's all in the compiler. It has knowledge about this specific type, and gives it special treatment. We write Nullable<int> a = null;, but the compiler changes it into Nullable<int> a = new Nullable<int>();
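As a small sanity check of that equivalence, the sketch below creates an "empty" value in three ways and one holding 5 in two ways. The class EmptyNullableDemo and its method names are invented for the illustration; the comparison uses the ordinary == operator on Nullable<int> values.

```csharp
using System;

// Hypothetical demo class; the name is not part of the original post.
public static class EmptyNullableDemo
{
    // null, the parameterless constructor and default(...) all give HasValue == false.
    public static bool AllEmpty()
    {
        Nullable<int> fromNull = null;
        Nullable<int> fromCtor = new Nullable<int>();
        Nullable<int> fromDefault = default(Nullable<int>);
        return !fromNull.HasValue && !fromCtor.HasValue && !fromDefault.HasValue
            && fromNull == fromCtor && fromCtor == fromDefault;
    }

    // Assigning 5 and calling the constructor with 5 give equal values.
    public static bool SameFive()
    {
        Nullable<int> assigned = 5;
        Nullable<int> constructed = new Nullable<int>(5);
        return assigned.HasValue && constructed.HasValue && assigned == constructed;
    }

    public static void Main()
    {
        Console.WriteLine(AllEmpty());  // prints True
        Console.WriteLine(SameFive());  // prints True
    }
}
```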

Comparing to null

So, what about comparisons?

    private static void TestForNull()
    {
        Nullable<int> a = null;
        if (a == null)
        {
            Console.WriteLine("a is null");
        }
    }

Didn't we just establish that a in this case is indeed an instance of Nullable<int>? If that is the case, how can a comparison with null evaluate to true? Again, it's a compiler trick. The answer is to be found in the IL code:

    .method private hidebysig static void  TestForNull() cil managed
    {
      // Code size       28 (0x1c)
      .maxstack  1
      .locals init ([0] valuetype [mscorlib]System.Nullable`1<int32> a)
      IL_0000:  ldloca.s   a
      IL_0002:  initobj    valuetype [mscorlib]System.Nullable`1<int32>
      IL_0008:  ldloca.s   a
      IL_000a:  call       instance bool valuetype [mscorlib]System.Nullable`1<int32>::get_HasValue()
      IL_000f:  brtrue.s   IL_001b
      IL_0011:  ldstr      "a is null"
      IL_0016:  call       void [mscorlib]System.Console::WriteLine(string)
      IL_001b:  ret
    } // end of method TestClass::TestForNull

What happens here is that a new Nullable<int> is created using the initobj instruction (lines IL_0000 - IL_0002), then the getter for the HasValue property is invoked, and the result is pushed to the evaluation stack (lines IL_0008 - IL_000a). The brtrue.s instruction will transfer control to the address specified after it (IL_001b) if the current value on the evaluation stack is true. The question then is, what is the current value on the evaluation stack when the brtrue.s instruction is executed? Well, since the Nullable<int> instance was created using the initobj instruction, HasValue is false (the default value for a Boolean field), so the branch is not taken and the message is printed. Again, where our code makes a comparison to null, the compiler replaces it with something else. The code above is equivalent to the following:

    private static void TestForNull()
    {
        Nullable<int> a = null;
        if (!a.HasValue)
        {
            Console.WriteLine("a is null");
        }
    }
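A quick way to convince yourself of this equivalence is to run both forms side by side. This is only a sketch; the class ComparisonDemo and its two helper methods are hypothetical names for the example.

```csharp
using System;

// Hypothetical demo class; the name is not part of the original post.
public static class ComparisonDemo
{
    // The form we write in C#.
    public static bool IsNullViaOperator(Nullable<int> n)
    {
        return n == null;
    }

    // The form the compiler effectively turns it into.
    public static bool IsNullViaHasValue(Nullable<int> n)
    {
        return !n.HasValue;
    }

    public static void Main()
    {
        Nullable<int> empty = null;
        Nullable<int> five = 5;
        Console.WriteLine(IsNullViaOperator(empty));  // prints True
        Console.WriteLine(IsNullViaHasValue(empty));  // prints True
        Console.WriteLine(IsNullViaOperator(five));   // prints False
        Console.WriteLine(IsNullViaHasValue(five));   // prints False
    }
}
```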

Before we wrap up, let's check one last oddity:

    private static void TestForNull()
    {
        Nullable<int> a = null;
        if (a.Equals(null))
        {
            Console.WriteLine("a is null");
        }
    }

...and the IL version:

    .method private hidebysig static void  TestForNull() cil managed
    {
      // Code size       35 (0x23)
      .maxstack  2
      .locals init ([0] valuetype [mscorlib]System.Nullable`1<int32> a)
      IL_0000:  ldloca.s   a
      IL_0002:  initobj    valuetype [mscorlib]System.Nullable`1<int32>
      IL_0008:  ldloca.s   a
      IL_000a:  ldnull
      IL_000b:  constrained. valuetype [mscorlib]System.Nullable`1<int32>
      IL_0011:  callvirt   instance bool [mscorlib]System.Object::Equals(object)
      IL_0016:  brfalse.s  IL_0022
      IL_0018:  ldstr      "a is null"
      IL_001d:  call       void [mscorlib]System.Console::WriteLine(string)
      IL_0022:  ret
    } // end of method TestClass::TestForNull

This method is similar to the previous one we looked at, but instead of invoking the HasValue property getter, it invokes the Equals method. Nullable<T> overrides Equals with a short and simple implementation that looks like this:

    public override bool Equals(object other)
    {
        if (!this.HasValue)
        {
            return (other == null);
        }
        if (other == null)
        {
            return false;
        }
        return this.value.Equals(other);
    }

Or, in plain English: if HasValue of the current instance is false, we are equal exactly when other is null. Otherwise, if other is null, we are not equal. Otherwise, we call the Equals method of the value in the current instance's value field, passing other to it. This Equals implementation means that a Nullable<T> will pretend to be null if HasValue is false.

As a side note, this implementation of Equals actually breaks the contract stipulated by the documentation of Object.Equals, which states (amongst other things) that x.Equals(null) must return false.
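The three branches of that implementation can be exercised with a short sketch like the one below. The class EqualsDemo and the helper Check are made-up names; the last case relies on the fact that Int32.Equals(object) returns false when the argument is not an int.

```csharp
using System;

// Hypothetical demo class; the name is not part of the original post.
public static class EqualsDemo
{
    // Thin wrapper so each case can be checked separately.
    public static bool Check(Nullable<int> n, object other)
    {
        return n.Equals(other);
    }

    public static void Main()
    {
        Nullable<int> empty = null;
        Nullable<int> five = 5;
        Console.WriteLine(Check(empty, null));  // prints True  (HasValue is false, other is null)
        Console.WriteLine(Check(empty, 5));     // prints False (HasValue is false, other is not null)
        Console.WriteLine(Check(five, null));   // prints False (has a value, other is null)
        Console.WriteLine(Check(five, 5));      // prints True  (delegates to the int's Equals)
        Console.WriteLine(Check(five, 5L));     // prints False (a boxed long is not an int)
    }
}
```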

Conclusion

Bottom line: Nullable<T> allows you to write code as if it could be null, but a combination of overrides and compiler wizardry translates that code into statements dealing with a regular value type instance, behaving like any other value type instance.