C#, pronounced "C sharp," is a new programming language that makes it easier for C and C++ programmers to generate COM+-ready programs. In addition, C# has been built from the ground up to make it easier to write and maintain programs. It's a little like taking all the good stuff in Visual Basic® and adding it to C++, while trimming off some of the more arcane C and C++ traditions.
Going forward, C# is expected to be the best language from Microsoft® for writing COM+ and Windows®-based programs for enterprise computing. However, you won't have to migrate your existing C or C++ code. If you like the new features in C#—and you probably will—you can migrate your mindset to C#. Let's face it—C++ is a powerful language, but it isn't always a walk in the park. I've used both Visual Basic and C++ professionally, and after a while I was asking myself why I needed to implement every last destructor for every last C++ class. C'mon already—you're a smart language. Visual C++® even has IntelliSense®. Clean up after me. If you like C and C++, but sometimes think like I do, C# is for you.
The main design goal of C# was simplicity rather than pure power. You do give up a little processing power, but you get cool stuff like type safety and automatic garbage collection in return. C# can make your code more stable and productive overall, meaning that you can more than make up that lost power in the long run. C# offers several key benefits for programmers: simplicity, consistency, modernity, object orientation, type safety, scalability, version support, compatibility, and flexibility.
Simplicity
What's one of the most annoying things about working in C++? It's gotta be remembering when to use the -> pointer indicator, when to use the :: for a class member, and when to use the dot. And the compiler knows when you get it wrong, doesn't it? It even tells you that you got it wrong! If there's a reason for that beyond out-and-out taunting, I fail to see it.
C# recognizes this irksome little fixture of the C++ programming life and simplifies it. In C#, everything is represented by a dot. Whether you're looking at members, classes, namespaces, references, or what have you, you don't need to track which operator to use.
Okay, so what's the second most annoying thing about working in C and C++? It's figuring out exactly which data type to use. In C#, a Unicode character is no longer a wchar_t, it's a char. A 64-bit integer is a long, not an __int64. And a char is a char is a char. There's no more char, unsigned char, signed char, and wchar_t to track. I'll talk more about data types later in this article.
The third most annoying problem that you run across in C and C++ is integers being used as Booleans, causing assignment errors when you confuse = and ==. C# separates these two types, providing a separate bool type that solves this problem. A bool can be true or false, and can't be converted into other types. Similarly, an integer or object reference can't be tested to be true or false; it must be compared to zero (or to null in the case of the reference). If you wrote code like this in C++:
    int i;
    if (i) . . .
you need to convert that into something like this for C#:
    int i;
    if (i != 0) . . .
Another programmer-friendly feature is the improvement over C++ in the way switch statements work. In C++, you could write a switch statement that fell through from case to case. For example, this code
    switch (i)
    {
        case 1:
            FunctionA();
        case 2:
            FunctionB();
            break;
    }
would call both FunctionA and FunctionB if i was equal to 1. C# does not allow this kind of silent fall-through: every case section that contains statements must end with an explicit jump such as break, goto, or return, and the compiler rejects the code above when it is compiled as C#. If you really do want the case statement to fall through, you can rewrite the switch block like this in C#:
    switch (i)
    {
        case 1:
            FunctionA();
            goto case 2;
        case 2:
            FunctionB();
            break;
    }
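To see the explicit fall-through in action, here's a minimal sketch. The SwitchDemo class and its Run method are made up for illustration (they aren't part of the article's samples); Run simply records which branches executed.

```csharp
using System;
using System.Text;

class SwitchDemo
{
    // Hypothetical helper: returns the letters of the branches that ran.
    public static string Run(int i)
    {
        StringBuilder log = new StringBuilder();
        switch (i)
        {
            case 1:
                log.Append("A");
                goto case 2;        // explicit jump into case 2
            case 2:
                log.Append("B");
                break;              // every section ends with a jump
            default:
                log.Append("-");
                break;
        }
        return log.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Run(1));  // AB
        Console.WriteLine(Run(2));  // B
        Console.WriteLine(Run(3));  // -
    }
}
```

Removing the goto case 2 line without adding a break in its place makes the compiler reject the switch, which is exactly the mistake the rule is meant to catch.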
Consistency
C# unifies the type system by letting you view every type in the language as an object. Whether you're using a class, a struct, an array, or a primitive, you'll be able to treat it as an object. Objects are combined into namespaces, which allow you to access everything programmatically. This means that instead of putting includes in your file like this
    #include <...>
    #include <...>
    #include <...>
you include a particular namespace in your program to gain access to the classes and objects contained within it:
    using System;
In COM+, all classes exist within a single hierarchical namespace. In C#, the using statement lets you avoid having to specify the fully qualified name when you use a class. For example, the System namespace contains several classes, including Console. Console has a WriteLine method that, as you might expect, writes a line to the system console. If you want to write the output part of a Hello World program in C#, you can say:
    System.Console.WriteLine("Hello World!");
This same code can be written as:
    using System;
    Console.WriteLine("Hello World!");
That's almost everything you need for the C# Hello World program. A complete C# program needs a class definition and a Main function. A complete, console-based Hello World program in C# looks like this:
    using System;

    class HelloWorld
    {
        public static int Main(String[] args)
        {
            Console.WriteLine("Hello, World!");
            return 0;
        }
    }
The first line makes System (the COM+ base class namespace) available to the program. The program class itself is named HelloWorld (code is arranged into classes, not by files). The Main method (which takes arguments) is defined within HelloWorld. The COM+ Console class writes the friendly message, and the program is finished. Of course, you could get fancy. What if you want to reuse the HelloWorld program? Easy: put it into its own namespace! Just wrap it in a namespace and declare the classes as public if you want them accessible outside the particular namespace. (Note here that I've changed the name Main to the more suitable name SayHi.)
    using System;

    namespace MSDNMag
    {
        public class HelloWorld
        {
            public static int SayHi()
            {
                Console.WriteLine("Hello, World!");
                return 0;
            }
        }
    }
You can then compile this into a DLL, and include the DLL with any other programs you're building. The calling program could look like this:
    using System;
    using MSDNMag;

    class CallingMSDNMag
    {
        // Main returns int so that the return 0 below is legal.
        public static int Main(string[] args)
        {
            HelloWorld.SayHi();
            return 0;
        }
    }
One final point about classes. If you have classes with the same name in more than one namespace, C# lets you define aliases for any of them so you don't have to fully qualify them. Here's an example. Suppose you have created a class NS1.NS2.ClassA that looks like this:
    namespace NS1.NS2
    {
        class ClassA {}
    }
You can then create a second namespace, NS3, that derives the class NS3.ClassB from NS1.NS2.ClassA like this:
    namespace NS3
    {
        class ClassB: NS1.NS2.ClassA {}
    }
If this construct is too long for you, or if you're going to repeat it several times, you can use the alias A for the class NS1.NS2.ClassA with the using statement like so:
    namespace NS3
    {
        using A = NS1.NS2.ClassA;
        class ClassB: A {}
    }
This effect can be accomplished at any level of an object hierarchy. For instance, you could also create an alias for NS1.NS2 like this:
    namespace NS3
    {
        using C = NS1.NS2;
        class ClassB: C.ClassA {}
    }
Modernity
Like coding languages, the needs of programmers evolve over time. What was once revolutionary is now sort of, well, dated. Like that old Toyota Corolla on the neighbor's lawn, C and C++ provide reliable transportation, but lack some of the features that people look for when they kick the tires. This is one of the reasons many developers have tinkered with the Java language over the past few years.
C# goes back to the drawing board and emerges with several features that I longed for in C++. Garbage collection is one example—everything gets cleaned up when it's no longer referenced. However, garbage collection can have a price. It makes problems caused by certain risky behavior (using unsafe casts and stray pointers, for example) far harder to diagnose and potentially more devastating to a program. To compensate for this, C# implements type safety to ensure application stability. Of course, type safety also makes your code more readable, so others on your team can see what you've been up to—you take the bad with the good, I guess. I'll go into this later in this article. C# has a richer intrinsic model for error handling than C++. Have you ever really gotten deep into a coworker's code? It's amazing—there are dozens of unchecked HRESULTs all over the place, and when a call fails, the program always ends up displaying an "Error: There was an error" message. C# improves on this situation by providing integral support for throw, try…catch, and try…finally as language elements. True, you could do this as a macro in C++, but now it's available right out of the box.
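Here's a small sketch of what that built-in error handling looks like in practice. The ErrorHandlingDemo class and its ParseQuantity and Describe methods are hypothetical names invented for this illustration; the exception types are the standard ones from the System namespace.

```csharp
using System;

class ErrorHandlingDemo
{
    // Hypothetical helper: parses a quantity and rejects negative values.
    public static int ParseQuantity(string text)
    {
        int value = int.Parse(text);    // throws FormatException on bad input
        if (value < 0)
        {
            throw new ArgumentOutOfRangeException("text", "Quantity cannot be negative.");
        }
        return value;
    }

    // Returns a short description of what happened for the given input.
    public static string Describe(string text)
    {
        string result = "";
        try
        {
            result = "ok:" + ParseQuantity(text);
        }
        catch (FormatException)
        {
            result = "not a number";
        }
        catch (ArgumentOutOfRangeException)
        {
            result = "negative";
        }
        finally
        {
            // Runs whether or not an exception was thrown.
            result += ";done";
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("12"));   // ok:12;done
        Console.WriteLine(Describe("abc"));  // not a number;done
        Console.WriteLine(Describe("-3"));   // negative;done
    }
}
```

Each failure gets its own catch block, so the caller can report something more useful than "There was an error," and the finally block is where cleanup would go.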
Part of a modern language is the ability to actually use it for something. It seems simple enough, but many languages completely ignore the needs for financial and time-based data types. They're too old economy or something. Borrowing from languages like SQL, C# implements built-in support for data types like decimal and string, and lets you implement new primitive types that are as efficient as the existing ones. I'll discuss some of the new support for data types and arrays later in the article.
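As a quick illustration of why a built-in decimal type matters for money, this sketch adds ten dimes with decimal and, for comparison, with double. The DecimalDemo class and its method names are invented for the example.

```csharp
using System;

class DecimalDemo
{
    // Adds 0.1 ten times using the decimal type.
    public static decimal SumTenthsAsDecimal()
    {
        decimal total = 0m;
        for (int i = 0; i < 10; i++)
        {
            total += 0.1m;
        }
        return total;
    }

    // The same loop using binary floating point, for comparison.
    public static double SumTenthsAsDouble()
    {
        double total = 0.0;
        for (int i = 0; i < 10; i++)
        {
            total += 0.1;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumTenthsAsDecimal() == 1.0m);  // True
        // 0.1 has no exact binary representation, so the double total
        // carries rounding error and need not compare equal to 1.0.
        Console.WriteLine(SumTenthsAsDouble() == 1.0);
    }
}
```

The m suffix marks a decimal literal; without it, 0.1 is a double and can't be assigned to a decimal without a cast.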
You'll also be glad to see that C# takes a more modern approach to debugging. The traditional way to write a debuggable program in C++ was to sprinkle it with #ifdefs and indicate that large sections of code would only be executed during the debugging process. You would end up with two implementations: a debug build and a retail build, with some of the calls in the retail build going to functions that do nothing. C# offers the Conditional attribute (in the System.Diagnostics namespace), which causes calls to a method to be compiled in or left out based on defined tokens.
Remember the MSDNMag namespace? A single Conditional attribute can make the SayHi member a debug-only function.
    using System;
    using System.Diagnostics;

    namespace MSDNMag
    {
        public class HelloWorld
        {
            [Conditional("DEBUG")]
            public static void SayHi()
            {
                Console.WriteLine("Hello, World!");
                return;
            }
            •••
        }
    }
Conditional functions must have void return types (as I've set in this sample). The client program would then have to look like this to get a Hello World message. Note that #define must come before anything else in the file, including the using statements, and that Main is void, so it doesn't return a value.
    #define DEBUG
    using System;
    using MSDNMag;

    class CallingMSDNMag
    {
        public static void Main(string[] args)
        {
            HelloWorld.SayHi();
        }
    }
The code is nice and uncluttered without all those #ifdefs hanging around, waiting to be ignored. Finally, C# is designed to be easy to parse, so vendors can create tools that allow source browsing and two-way code generation.
Object Oriented
Yeah, yeah. C++ is object oriented. Right. I've personally known people who have worked on multiple inheritance for a week, then retired out of frustration to North Carolina to clean hog lagoons. That's why C# ditches multiple inheritance in favor of native support for the COM+ virtual object system. Encapsulation, polymorphism, and inheritance are preserved without all the pain.
C# ditches the entire concept of global functions, variables, and constants. Instead, you can create static class members, making C# code easier to read and less prone to naming conflicts.
And speaking of naming conflicts, have you ever forgotten that you created a class member and redefined it later on in your code? By default, C# methods are nonvirtual, requiring an explicit virtual modifier. It's far harder to accidentally override a method, it's easier to provide correct versioning, and the vtable doesn't grow as quickly. Class members in C# can be defined as private, protected, public, or internal. You retain full control over their encapsulation.
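Methods and operators can be overloaded in C#, using a syntax that's a lot easier than the one used by C++. However, you can't overload global operator functions; the overloading is strictly local in scope. The overloading of method F below is an example of what this looks like. Two of the declarations are commented out because the compiler rejects them: overloads can't differ only by ref versus out, and the return type isn't part of a method's signature.
    interface ITest
    {
        void F();               // F()
        void F(int x);          // F(int)
        void F(ref int x);      // F(ref int)
        // void F(out int x);   // error: differs from F(ref int) only by ref/out
        void F(int x, int y);   // F(int, int)
        int F(string s);        // F(string)
        // int F(int x);        // error: same signature as F(int) above
    }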
The COM+ component model is supported through the implementation of delegates—the object-oriented equivalent of function pointers in C++.
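If you haven't met delegates before, this minimal sketch shows the idea. The IntTransform delegate type and the DelegateDemo class are hypothetical names made up for the example; the point is that a delegate is a type-safe object that refers to a method with a matching signature.

```csharp
using System;

// Hypothetical delegate type: any method that takes an int and returns an int.
delegate int IntTransform(int value);

class DelegateDemo
{
    public static int Twice(int value) { return value * 2; }
    public static int Square(int value) { return value * value; }

    // Calls whichever method the delegate currently refers to.
    public static int Apply(IntTransform transform, int input)
    {
        return transform(input);
    }

    public static void Main()
    {
        IntTransform t = new IntTransform(Twice);
        Console.WriteLine(Apply(t, 5));   // 10

        t = new IntTransform(Square);
        Console.WriteLine(Apply(t, 5));   // 25
    }
}
```

Unlike a raw C++ function pointer, the compiler checks that Twice and Square really match the IntTransform signature before it lets you wrap them.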
Interfaces support multiple inheritance. Classes can privately implement internal interfaces through explicit member implementations, without the consumer ever knowing about it.
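Explicit member implementations are easier to show than to describe. In this sketch, the IAuditable and IPrintable interfaces and the Invoice class are invented for illustration. Both interfaces declare a Describe method, and Invoice implements each one separately without exposing either on its own public surface.

```csharp
using System;

interface IAuditable { string Describe(); }
interface IPrintable { string Describe(); }

class Invoice : IAuditable, IPrintable
{
    // Explicit implementations: reachable only through the interface type.
    string IAuditable.Describe() { return "audit record"; }
    string IPrintable.Describe() { return "printable page"; }
}

class ExplicitInterfaceDemo
{
    public static string ViaAuditable(Invoice invoice)
    {
        IAuditable auditable = invoice;
        return auditable.Describe();
    }

    public static string ViaPrintable(Invoice invoice)
    {
        IPrintable printable = invoice;
        return printable.Describe();
    }

    public static void Main()
    {
        Invoice invoice = new Invoice();
        Console.WriteLine(ViaAuditable(invoice));  // audit record
        Console.WriteLine(ViaPrintable(invoice));  // printable page
        // invoice.Describe() would not compile: the class itself
        // has no public Describe member.
    }
}
```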
Type Safety
Although some power users would disagree with me, type safety promotes robust programs. Several features that promote proper code execution (and more robust programs) in Visual Basic have been included in C#. For example, all dynamically allocated objects and arrays are initialized to zero. Although C# doesn't automatically initialize local variables, the compiler reports an error if you try to read one before you've assigned it a value. When you access an array, it is automatically range checked. Unlike C and C++, you can't overwrite unallocated memory.
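In C# you can't create an invalid reference. All casts are required to be safe, and you can't cast between integer and reference types. Garbage collection in C# ensures that you don't leave references dangling around your code. Hand-in-hand with this feature is overflow checking. In a checked context, arithmetic operations and conversions that overflow the target type throw an exception instead of silently wrapping around. Constant expressions are checked at compile time by default; for everything else, the shipping compilers leave checking off unless you turn it on with the checked keyword or a compiler option. Of course, there are some valid reasons to want a variable to overflow. If you do, you can explicitly disable the checking with the unchecked keyword.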
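To make the overflow behavior concrete, here's a minimal sketch using the checked and unchecked operators. The OverflowDemo class and its TryAdd and WrapAdd methods are hypothetical names for this example only.

```csharp
using System;

class OverflowDemo
{
    // Adds two ints with overflow checking turned on.
    // Returns false (and sets sum to 0) if the result doesn't fit in an int.
    public static bool TryAdd(int a, int b, out int sum)
    {
        try
        {
            sum = checked(a + b);
            return true;
        }
        catch (OverflowException)
        {
            sum = 0;
            return false;
        }
    }

    // Adds two ints with checking explicitly disabled, so the result wraps.
    public static int WrapAdd(int a, int b)
    {
        return unchecked(a + b);
    }

    public static void Main()
    {
        int sum;
        Console.WriteLine(TryAdd(1, 2, out sum));             // True
        Console.WriteLine(TryAdd(int.MaxValue, 1, out sum));  // False
        Console.WriteLine(WrapAdd(int.MaxValue, 1));          // -2147483648
    }
}
```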
As I've mentioned, the data types supported in C# are somewhat different from what you might be used to in C++. For instance, the char type is 16 bits. Certain useful types, like decimal and string, are built in. Perhaps the biggest difference between C++ and C#, however, is the way C# handles arrays.
C# arrays are managed types, meaning that an array variable holds a reference to the array rather than the array's contents, and that arrays are garbage collected. You can declare arrays in several ways, including as multidimensional (rectangular) arrays and as arrays of arrays (jagged). Note in the following examples that the square brackets come after the type, not after the identifier as in some languages.
    int[] intArray;             // A simple array
    int[,,] intArray;           // A multidimensional array
                                // of rank 3 (3 dimensions)
    int[][] intArray;           // A jagged array of arrays
    int[][,,][,] intArray;      // A single-dimensional array
                                // of three-dimensional arrays
                                // of two-dimensional arrays
Arrays are actually objects; when you first declare them they don't have a size. For this reason, you must create them after you declare them. Suppose you want an array of size 5. This code will do the trick:
    int[] intArray = new int[5];
If you do this twice, the variable simply ends up referring to the second array; the first one is left for the garbage collector, and its contents are not copied over. Therefore
    int[] intArray;
    intArray = new int[5];
    intArray = new int[10];
results in an array called intArray, which has 10 members. Instantiating a rectangular array is similarly easy:
    int[,] intArray = new int[3,4];
However, instantiating a jagged array needs a bit more work. You might expect to say new int[3][4], but you really need to say:
    int[][] intArray = new int[3][];
    for (int a = 0; a < intArray.Length; a++)
    {
        intArray[a] = new int[4];
    }
You can initialize an array in the same statement that declares and instantiates it by using curly brackets:
    int[] intArray = new int[5] {1, 2, 3, 4, 5};
You can do the same thing with a string-based array:
    string[] strArray = new string[3] {"MSJ", "MIND", "MSDNMag"};
If you mix brackets, you can initialize a multidimensional array:
    int[,] intArray = new int[3, 2] { {1, 2}, {3, 4}, {5, 6} };
You can also initialize a jagged array:
    int[][] intArray = new int[][] {
        new int[] {2, 3, 4},
        new int[] {5, 6, 7}
    };
If you leave out the new operator, you can even initialize an array with implicit dimensions:
    int[] intArray = {1, 2, 3, 4, 5};
Arrays are considered objects in C#, and as such they are handled like objects, not like an addressable stream of bytes. Specifically, arrays are automatically garbage collected, so you don't need to destroy them when you're finished using them. Arrays are based on the System.Array class, so you can treat them conceptually like a collection object, using their Length property and looping through each item in the array. If you define intArray as shown earlier, the expression
    intArray.Length
evaluates to 5. The System.Array class also provides ways to copy, sort, and search arrays. C# provides a foreach statement, which operates like its counterpart in Visual Basic, letting you loop through an array. Consider this snippet:
    int[] intArray = {2, 4, 6, 8, 10, -2, -3, -4, 8};
    foreach (int i in intArray)
    {
        System.Console.WriteLine(i);
    }
This code will print each number in intArray on its own line of the system console. The System.Array class also provides a GetLength member function, which takes the zero-based index of the dimension you want to measure, so the preceding code could also be written like this (remember, arrays are zero-based in C#):
    for (int i = 0; i < intArray.GetLength(0); i++)
    {
        System.Console.WriteLine(intArray[i]);
    }
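For a taste of the copy, sort, and search support in System.Array, here's a short sketch. The ArrayToolsDemo class and its SortedCopy method are invented for the example; Array.Copy, Array.Sort, Array.IndexOf, and Array.BinarySearch are the standard static methods.

```csharp
using System;

class ArrayToolsDemo
{
    // Returns a sorted copy and leaves the source array untouched.
    public static int[] SortedCopy(int[] source)
    {
        int[] copy = new int[source.Length];
        Array.Copy(source, copy, source.Length);
        Array.Sort(copy);
        return copy;
    }

    public static void Main()
    {
        int[] intArray = {2, 4, 6, 8, 10, -2, -3, -4, 8};
        int[] sorted = SortedCopy(intArray);

        Console.WriteLine(sorted[0]);                      // -4
        Console.WriteLine(intArray[0]);                    // 2
        Console.WriteLine(Array.IndexOf(intArray, 10));    // 4
        // BinarySearch requires a sorted array.
        Console.WriteLine(Array.BinarySearch(sorted, 6));  // 5
    }
}
```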
Scalability
C and C++ require all sorts of often-incompatible header files before you can compile all but the simplest code. C# gets rid of these frequently aggravating headers by combining the declaration and definition of types. It also directly imports and emits COM+ metadata, making incremental compiles much easier. When a project gets large enough, you might want to split up your code into smaller source files. C# doesn't have any restrictions about where your source files live or what they're named. When you compile a C# project, you can think of it as concatenating all the source files, then compiling them into one big file. You don't have to track which headers go where, or which routines belong in which source file. This also means that you can move, rename, split, or merge source files without breaking your compile.
Version Support
DLL Hell is a constant problem for users and programmers alike. MSDN® Online has even dedicated a service specifically for users who need to track the different versions of system DLLs. There's nothing a programming language can do to keep a library author from messing around with a published API. However, C# was designed to make versioning far easier by retaining binary compatibility with existing derived classes. When you introduce a new member in a base class with the same name as one that already exists in a derived class, it doesn't cause an error. However, the designer of the derived class must indicate whether the method is meant as an override or as a new method that just hides the similar inherited method.
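The difference between overriding and hiding is easiest to see side by side. In this sketch, BaseReport, CustomReport, and VersioningDemo are hypothetical classes made up for illustration. Title is overridden, while Footer is hidden with the new modifier.

```csharp
using System;

class BaseReport
{
    public virtual string Title() { return "base title"; }
    public string Footer() { return "base footer"; }
}

class CustomReport : BaseReport
{
    // Replaces the base implementation, even through a BaseReport reference.
    public override string Title() { return "custom title"; }

    // Hides the inherited method; callers using a BaseReport reference
    // still get the base version.
    public new string Footer() { return "custom footer"; }
}

class VersioningDemo
{
    public static void Main()
    {
        BaseReport report = new CustomReport();
        Console.WriteLine(report.Title());                   // custom title
        Console.WriteLine(report.Footer());                  // base footer
        Console.WriteLine(((CustomReport)report).Footer());  // custom footer
    }
}
```

Because the derived class has to say override or new, a base class author who later adds a Footer method can't silently change what existing callers get.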
As I've already mentioned, C# works with a namespace model. Classes and interfaces in class libraries must be defined in hierarchical namespaces instead of in a flat model. Applications can explicitly import a single member of a namespace, so there won't be any collisions when multiple namespaces contain similarly named members. When you declare a namespace, subsequent declarations are considered to be part of the same declaration space. Therefore, if your code looks like this
    namespace MSDNMag.Article
    {
        class Author { ... }
    }
    namespace MSDNMag.Article
    {
        class Topic { ... }
    }
you could express the same code like so:
    namespace MSDNMag.Article
    {
        class Author { ... }
        class Topic { ... }
    }
Compatibility
Four types of APIs are common on the Windows platform and C# supports all of them. The old-style C APIs have integrated support in C#. Applications can use the N/Direct features of COM+ to call C-style APIs. C# provides transparent access to standard COM and OLE Automation APIs and supports all data types through the COM+ runtime. Most importantly, C# supports the COM+ Common Language Subset specification. If you've exported any entities that aren't accessible from another language, the compiler can optionally flag the code. For instance, a class can't have two members runJob and runjob because a case-insensitive language would choke on the definitions. When you call a DLL export, you need to declare the method, attach a DllImport attribute (from the System.Runtime.InteropServices namespace), and specify any custom marshaling and return value information that overrides the COM+ defaults. The following shows how to write a Hello World program that displays its message of cheer in a standard Windows message box.
    using System;
    using System.Runtime.InteropServices;

    class HelloWorld
    {
        // IntPtr matches the pointer-sized window handle parameter.
        [DllImport("user32.dll")]
        public static extern int MessageBoxA(IntPtr h, string m, string c, int type);

        public static int Main()
        {
            return MessageBoxA(IntPtr.Zero, "Hello World!", "Caption", 0);
        }
    }
Each COM+ type maps to a default native data type, which COM+ uses to marshal values across a native API call. The C# string value maps to the LPSTR type by default, but it can be overridden with the MarshalAs attribute like so:
    using System;
    using System.Runtime.InteropServices;

    class HelloWorld
    {
        [DllImport("user32.dll")]
        public static extern int MessageBoxW(
            IntPtr h,
            [MarshalAs(UnmanagedType.LPWStr)] string m,
            [MarshalAs(UnmanagedType.LPWStr)] string c,
            int type);

        public static int Main()
        {
            return MessageBoxW(IntPtr.Zero, "Hello World!", "Caption", 0);
        }
    }
In addition to working with DLL exports, you can work with classic COM objects in several ways: create them with CoCreateInstance, query them for interfaces, and call methods on them. If you want to import a COM class definition for use within your program, you must take two steps. First, you must create a class and use the ComImport attribute to mark it as related to a specific GUID. The class you create can't have any base classes or interface lists, nor can it have any members.
    // declare FilgraphManager as a COM classic coclass
    // (ComImport and Guid live in System.Runtime.InteropServices)
    [ComImport, Guid("E436EBB3-524F-11CE-9F53-0020AF0BA770")]
    class FilgraphManager
    {
    }
After the class is declared in your program, you can create a new instance of it with the new keyword (which is equivalent to the CoCreateInstance function).
    class MainClass
    {
        public static void Main()
        {
            FilgraphManager f = new FilgraphManager();
        }
    }
You can query interfaces indirectly in C# by attempting to cast an object to a new interface. If the cast fails, it will throw a System.InvalidCastException. If it works, you'll have an object that represents that interface.
    FilgraphManager graphManager = new FilgraphManager();
    IMediaControl mc = (IMediaControl) graphManager;
    mc.Run(); // If the cast succeeds, this line will work.
Flexibility
It's true that C# and COM+ create a managed, type-safe environment. However, it's also true that some real-world applications need to get to the native code level—either for performance considerations or to use old-style, unmodernized APIs from other programs. I've discussed ways to use APIs and COM components from your C# program. C# lets you declare unsafe classes and methods that contain pointers, structs, and static arrays. These methods won't be type-safe, but they will execute within the managed space so you don't have to marshal boundaries between safe and unsafe code.
These unsafe features are integrated with the COM+ EE and code access security in COM+. This means that a developer can pin an object so that the garbage collector will pass over it when it's doing its work. (Sort of like a mezuzah for your code.) Unsafe code won't be executed outside a fully trusted environment. Programmers can even turn off garbage collection while an unsafe method is executing.