ASPToday
C# Language Reference
http://www.asptoday.com/content.asp?id=1036
 


This appendix gives a summary of the definition of the C# language. It is not intended to be read from beginning to end in one go. Rather, it is there to provide a reference for those times when you need to look up a particular point of C# syntax. Accordingly, although the topics are presented roughly in order, with the most basic topics first and the more advanced topics towards the end of the appendix, you will find that there is no strict progression in the examples. Any example may assume knowledge of any part of the C# language other than the point being illustrated.

Note that all the examples here assume that you are an experienced programmer, with a knowledge of the principles of object-oriented programming. For example, we assume that you are familiar with the broad concepts of inheritance and method overrides. If you need to revise any of these concepts, or the details of the .NET framework, then you will be better off reading the chapters in this book that cover those concepts. The appendix covers the C# language itself. It does not deal with the .NET base classes except where certain aspects of language syntax depend on certain base classes. Nor does it cover assemblies, the C# compiler, or the .NET environment in which compiled Intermediate Language code will run.

Note also that this appendix, while intended as a reference, is also designed to be easy to read rather than exhaustive and formal in its coverage of C#. If you require detailed syntax of the most esoteric features of the language or a rigorous, mathematical-style, definition of the syntax, then the appropriate place to look is the C# language specification in the MSDN documentation.

Conventions

Throughout much of the appendix, we try to present real-world samples. To this end, many examples involve classes or methods that we presume have been defined elsewhere. In general, if any class or method is introduced without comment, you can assume that it is intended as an example of a class or method that you might have written. .NET base classes are not generally used to provide samples unless explicitly indicated, with the exception of Console.WriteLine() , which is used to write the contents of its parameters in various formats to the console window.

Also, note that C# classes and structs contain a number of items that can contain code: methods, properties, indexers, constructors, destructors, operator overloads or custom casts. The general term for such members is function . We use the term member to refer to any item, function or data, that is defined in a class or struct.

Where the meaning is clear, we will use "class etc." or simply "class", when our discussion applies equally to classes or structs.

C# Program Syntax

In this section we will cover the main principles of overall C# syntax.

Basic Syntax

Statements in C# are terminated by semicolons:

int X = 3;
Console.WriteLine("X is " + X);

All keywords in C# are completely lowercase.

The compiler also ignores all excess white space - including carriage returns, space characters and tab characters, so that for example the following code snippet is completely equivalent to the above.

int X =
3 ; Console.WriteLine("X is "
+ X);

In particular, note that it is quite legitimate for statements to be spread across multiple lines - it is always the semicolon that marks the end of the statement.

The first snippet would nevertheless be regarded as better programming style because it is easier to read. On the other hand, the following code would not compile, because the semicolons are missing:

int X = 3 // error - no semicolon
Console.WriteLine("X is " + X)

C# is also case sensitive, so that for example X and x are considered different variables.

int X = 3;
int x = 6;

Microsoft has provided fairly detailed guidelines for naming conventions (detailed in the MSDN documentation for the .NET framework). Here we will simply note that you are advised to rely on case sensitivity (that is, to declare names that differ only in case) only in code that is not visible to outside classes etc., since your code may be called by VB.NET code, and VB.NET is case insensitive. We will also note that the examples in this appendix have wherever possible been designed according to the Microsoft naming guidelines.

Block Statements

C# allows several statements to be grouped together into a single block statement (sometimes also referred to as a compound statement), by enclosing them in curly braces.

{
    int X = 3;
    Console.WriteLine("X is " + X);
}

A block statement is treated as if it was a single statement, and may be used anywhere that a single statement would be acceptable. For example, an if statement takes one other statement, which will be conditionally executed depending on whether the if expression evaluates to true . If you wish more than one statement to be conditionally executed, you can simply put all of the single statements into one block statement.

if (Profit > 500000) // single statement in if block
    Console.WriteLine("Wow!!!!!!!!!!");

Or:

if (Profit > 500000) // block statement in if block
{
    string Message = "Wow!!!!!!!!!!";
    Console.WriteLine(Message);
}

Braces also serve to define the scope of variables. Any variable declared anywhere in a block statement is valid until the closing brace that ends that block statement.

if (Profit > 500000) // block statement in if block
{
    string Message = "Wow!!!!!!!!!!";
    Console.WriteLine(Message);
}
Message = "if block is over"; // WRONG. The variable Message no
                              // longer exists

Besides their use in marking block statements, curly braces are used to mark the extent of definitions of classes, structs, methods etc.

class MyClass
{
    // definition of MyClass
}

Note that there is no need to add a semicolon after a closing curly brace that is used to make a block statement or scope a definition. However, a trailing semicolon is required if curly braces are used to delimit a list of values for an array:

double [] d = {2.0, 4.0, 6.0};

Comments

C# has three syntaxes for marking comments:

If the character sequence // occurs in any line, then the rest of that line is assumed to be a comment.

int X = 5; // this is a comment to say that X is a groovy int
int Y = 6;

If the character sequence /* occurs anywhere, then the compiler assumes that everything that follows is a comment until it discovers a */ sequence. The comment may last for part of a line or for several lines.

/* The following statement ought to tell us how much money we will make
   from UnitsSold units, assuming costs remain as specified earlier.
   Not sure if this is working so we temporarily substitute 100 to see
   what happens */
decimal Profit = EstimateProfitFromUnitsSold( /* UnitsSold */ 100);

Note that it is not possible to nest /* */ comments. The compiler will assume the comment ends the moment it encounters a */ sequence, no matter how many initial /* it encountered.

If the character sequence /// occurs anywhere, then the rest of that line is assumed to be a comment. However, this is also regarded as a special comment from which XML documentation will be automatically generated, if the option to generate documentation is specified with the compiler options.

Note that if any of the symbols /* , // , and /// occur inside a string constant (in other words, after a " ), then they will be interpreted as part of the string, not as a comment marker.

string Message = "/* is one of the symbols used for comments";
Console.WriteLine(Message); // this line will execute normally
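
The three comment styles described in this section can be sketched together in one small file. The class ProfitEstimator and the profit-per-unit figure are assumptions made up for this illustration; only the comment syntax itself comes from the text above.

```csharp
using System;

namespace Wrox.ProfessionalCSharp.AppendixA
{
    /// <summary>
    /// Hypothetical helper class, used only to illustrate the comment styles.
    /// </summary>
    public class ProfitEstimator
    {
        /// <summary>Estimates profit as units sold times profit per unit.</summary>
        /// <param name="unitsSold">Number of units sold.</param>
        /// <returns>The estimated profit.</returns>
        public static decimal EstimateProfitFromUnitsSold(int unitsSold)
        {
            decimal profitPerUnit = 2.5M; // single-line comment: an assumed figure

            /* delimited comment: this one
               extends over two lines */
            return unitsSold * /* delimited comment inside a line */ profitPerUnit;
        }

        public static void Main()
        {
            Console.WriteLine(EstimateProfitFromUnitsSold(100));
        }
    }
}
```

The /// comments only produce XML documentation if the compiler is asked to generate it; otherwise they behave like ordinary // comments.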

Program Structure

A C# program can be considered informally to consist of the following items:

  • Namespaces - labels that can be used to provide extended names to classes and so avoid naming conflicts.
  • Types - definitions of data types, including classes, structs, enumerations, and delegates.
  • Variables - declarations of instances of the data types.
  • Executable statements - instructions that will be executed when the program is run (as opposed to generic statements, which might contain instructions or might simply declare variables etc.)
  • Functions - declarations of sequences of statements, grouped into a callable entity, including methods, properties, indexers, constructors, destructors, operator overloads and casts.

C# programs also contain interfaces and preprocessor directives. Interfaces are definitions of contracts that define the functions a class guarantees to implement. Preprocessor directives are commands that are formed using a special syntax and which generally inform the compiler how the code should be compiled. However, since these are compiler directives and do not actually exist in the compiled code, we won't consider them in this section - they are covered later in the appendix. Most preprocessor directives can be placed anywhere in your code.

C# has strict rules for which items in your code can be contained in which other items. The situation is shown in the diagram, which indicates which items can be contained in which other items. The arrows in this diagram lead from the contained item to the containing item.

All items that exist in the compiled code should be contained in a namespace. This is the only rule that is not rigidly enforced. It is legal C# code to define types that are not in a namespace, but this is not recommended. In almost all cases, all your types and other items will be in at least one namespace.


class MyClass { /*etc.*/} // OK but not advised - ought to be in a
// namespace

namespace Wrox.ProfessionalCSharp.AppendixA
{
// all other items must go in here
}

Because declaring types outside namespaces is not recommended, we have not shown this situation in the above diagram. Within a namespace, the only top level items that can be defined are definitions of classes, structs, enumerations, delegates and interfaces, or other namespaces. In particular, it is not permitted to instantiate a variable of any type outside the definition of a class or struct.


namespace Wrox.ProfessionalCSharp.AppendixA
{
int Y; // WRONG - must be inside a data type definition

class MyClass
{
public int X; // OK
}
}

This rule exists because C# is designed to encourage an object-oriented programming style, as opposed to the procedural style of languages such as Visual Basic. The idea is that everything in C# is an object or part of an object.

Definitions of data types (classes etc) may also occur inside definitions of other classes or structs. However, data types may not be defined inside methods.

// inside a namespace definition
class MyClass
{
public enum Shape {Square, Circle, Pentagon}; // OK
public void DoSomething()
{
enum HouseType {Terrace, EndTerrace, Detached}; // WRONG! Declared
// inside a method
}
}

Variables may be instantiated either inside class definitions (member fields) or inside method definitions (local variables). They may not be instantiated anywhere else.

// inside a namespace definition
class MyClass
{
private int X ; // OK - member field

public void DoSomething()
{
int X; // OK - local variable
}
}

Note that it is perfectly acceptable - as the above example shows - to declare a variable in a method that has the same name as a field in the containing class. In this case the local variable hides the field, and if you need to access the field inside that method, you must refer to it as this.X. The same applies to method parameters:

class Customer
{
    private string name;

    public Customer(string name)
    {
        this.name = name;
    }
}

On the other hand, it is an error to declare a variable inside a block statement that would hide another variable of that name in the same method:

void DoSomething()
{
    int X; // OK - local variable
    {
        int X; // WRONG - hides the other X variable
    }
}

Obviously, the only place in which statements, other than definitions of data types and declarations of variables, can occur is inside method definitions.

Namespaces are normally the topmost level in your code. Everything else is defined inside a namespace. The only thing that can contain a namespace is another namespace.

namespace Wrox
{
namespace ProfessionalCSharp.AppendixA
{
// class definitions etc.
}
}

If you are following good programming principles, the only statements that should normally occur outside a namespace are:

  • Preprocessor directives - since these do not actually compile to any items in the compiled assembly, but simply affect the way the compiler does its work
  • The using statement. This statement has the syntax of a normal statement, but has a similar effect to a preprocessor directive in that all it does is modify how the compiler recognizes symbols in your code as it is compiling. It doesn't cause any items in the compiled code to be created.

#define DEBUG // OK
using System; // OK

namespace Wrox
{
    // class definitions etc.
}
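
The containment rules in this section can be seen together in one small, complete source file. This is only a sketch: the Status enumeration, the Describe() method, and the StructureDemo class are invented for the illustration, and the customer name is arbitrary.

```csharp
using System; // using statement: placed outside the namespace

namespace Wrox.ProfessionalCSharp.AppendixA
{
    // Only type definitions (and other namespaces) may appear at this level.
    public class Customer
    {
        // A data type defined inside a class definition.
        public enum Status { Active, Closed }

        private string name;                    // member field
        private Status status = Status.Active;  // member field with an initial value

        public Customer(string name)
        {
            this.name = name; // the parameter hides the field, so use this.name
        }

        public string Describe()
        {
            // Executable statements and local variables live inside functions.
            string description = name + " (" + status + ")";
            return description;
        }
    }

    public class StructureDemo
    {
        public static void Main()
        {
            Customer customer = new Customer("Arabel Jones");
            Console.WriteLine(customer.Describe());
        }
    }
}
```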

Identifiers

Identifiers (that is to say, names of types, variables, and namespaces) must, with a couple of exceptions, conform to the Unicode 3.0 standard. For practical purposes this means that they must begin with an upper or lower case letter (A-Z, a-z, or a valid letter in another language, for example the letters æ, ø, or å used in some Scandinavian languages) or the underscore character (_), and may then contain further letters, underscores, and any of the digits (0-9). If you wish to use a Unicode character from outside the Latin alphabet in an identifier, you can do so using the syntax \uxxxx where xxxx is the hexadecimal representation of the Unicode character in question. For simplicity we'll demonstrate this with an example that does come from the Latin alphabet: The Unicode number for the letter 'e' is 0x0065. Hence the following two lines of code are completely equivalent:

int Result = 20;

int R\u0065sult = 20;

You cannot use any C# keyword as the name of any item - for obvious reasons - unless you prefix it with the character @ to indicate that it is to be interpreted as an identifier rather than a keyword. If you do this, the @ character does not form part of the identifier as it appears in the compiled assembly. A full list of C# keywords to avoid is given at the end of this appendix. Although it does not count as an error, you are also strongly advised not to use any Visual Basic.NET keywords as names anywhere where the name may be visible to other assemblies - as this could cause problems for any developer who wants to access your assembly from Visual Basic.NET code. The Visual Basic .NET keywords are listed in Chapter 8.

class foreach // WRONG - foreach is a keyword
{
    public int Inherits; // Correct but might cause confusion as Inherits is
                         // a keyword in VB.NET
    public int X0_B; // OK
    public int @if; // also OK
}
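
As a rough, runnable illustration of the two points above (Unicode escapes and the @ prefix), consider the following sketch. The class IdentifierDemo and its members are hypothetical names chosen for this example.

```csharp
using System;

public class IdentifierDemo
{
    // The @ prefix lets a keyword be used as a name; the compiled name is "if".
    public int @if = 1;

    public static int Sum()
    {
        int Result = 20;
        R\u0065sult += 5;   // \u0065 is 'e', so this is the same variable as Result
        int @class = 10;    // keyword used as a local variable name
        return Result + @class;
    }

    public static bool HasFieldNamedIf()
    {
        // Reflection sees the field under the name "if", without the @ character.
        return typeof(IdentifierDemo).GetField("if") != null;
    }

    public static void Main()
    {
        Console.WriteLine(Sum());
        Console.WriteLine(HasFieldNamedIf());
    }
}
```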

Statements

All statements (pieces of code that can be executed) in C# consist of one of the following:

  1. A call to a method, property or constructor, possibly involving an assignment:

NewAccount = Account.Open(NewCustomerName, AccountType, Branch);

  2. An expression:

Balance *= (1.0 + MonthlyInterestRate);
++MonthsAccountOpen;

Note that an expression is only permitted as a statement if it has some effect, such as assigning a value to something. Do-nothing statements are not valid C#.

Y+1; // WRONG - doesn't actually do anything that can have any
     // effect in the code

  3. A variable declaration:

int X;

  4. One of the C# commands that controls execution flow:

if (X > 5)
    DoSomething();

  5. A block statement consisting of several simple statements enclosed by braces. Note that a block statement may itself contain further block statements (nested block statements). There is no limit to the level of nesting (other than that beyond 3 or 4 levels, your code can start to get hard to read).

// this is all one single block statement
{
foreach (Customer NextCustomer in Customers)
{
if (NextCustomer.Balance < 0)
{ // this starts another block statement - this one is
// conditionally executed depending on the if condition.
NextCustomer.SendReminderLetter();
++BalancesOwing;
}
if (NextCustomer.Balance < -1000)
DebtorsList.Add(NextCustomer);
}
}
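
The kinds of statement listed above can be combined in a short compilable sketch. The Account class and the CountOverdrawn() method are hypothetical stand-ins for the Customer code in the preceding example, which is assumed to be defined elsewhere.

```csharp
using System;

// Hypothetical class, standing in for the Customer type used above.
public class Account
{
    public decimal Balance;

    public Account(decimal balance)
    {
        Balance = balance;
    }
}

public class StatementDemo
{
    public static int CountOverdrawn(Account[] accounts)
    {
        int balancesOwing;                  // variable declaration
        balancesOwing = 0;                  // expression (assignment)
        foreach (Account next in accounts)  // command that controls execution flow
        {                                   // block statement
            if (next.Balance < 0)
                ++balancesOwing;            // expression (increment)
        }
        return balancesOwing;
    }

    public static void Main()
    {
        // Call to a constructor and to a method, each involving an assignment.
        Account[] accounts = { new Account(-5M), new Account(10M), new Account(-2000M) };
        int overdrawn = CountOverdrawn(accounts);
        Console.WriteLine(overdrawn);
    }
}
```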

Namespaces

Namespaces provide an additional means of identifying a type (class, struct, enumeration or delegate), since the full name of any class etc. includes the names of any containing namespaces.

A namespace is defined using the keyword namespace , and its scope is indicated by braces.

namespace Wrox
{
// all class definitions etc. here
}

It is possible to nest namespaces inside each other:

namespace Wrox
{
namespace ProfessionalCSharp
{
// all class definitions etc. here
}
}

Alternatively, you can declare nested namespaces with one single namespace command. The following code has exactly the same effect as the previous version.

namespace Wrox.ProfessionalCSharp
{
// all class definitions etc. here
}

If we now declare the class inside the above nested namespace:

namespace Wrox.ProfessionalCSharp
{
class ChapterOutline
{
//implementation
}
}

Then the full name of this class is actually not just ChapterOutline, but Wrox.ProfessionalCSharp.ChapterOutline. If this class is referred to from outside the Wrox.ProfessionalCSharp namespace, then it must be referred to by its full name. If you wish to refer to a sub-namespace from inside a parent namespace, then you only need give the name of the sub-namespace. For example, if some code in the Wrox namespace wanted to refer to the ChapterOutline class, it would refer to it as ProfessionalCSharp.ChapterOutline.

Note that the above example illustrates the recommended convention for namespace names: You should start with the name of your company, then nest namespaces that give the name of the technology or product that the software is part of.
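
The naming rules above can be checked with a small sketch. The NamespaceDemo class is an invented name; it sits in the Wrox namespace so that it can refer to ChapterOutline both through the sub-namespace name and through the full name.

```csharp
using System;

namespace Wrox.ProfessionalCSharp
{
    public class ChapterOutline
    {
        // implementation
    }
}

namespace Wrox
{
    public class NamespaceDemo
    {
        public static string FullNameViaSubNamespace()
        {
            // Inside Wrox, the name of the sub-namespace is enough.
            return typeof(ProfessionalCSharp.ChapterOutline).FullName;
        }

        public static string FullNameViaFullName()
        {
            // The full name can be used from any namespace.
            return typeof(Wrox.ProfessionalCSharp.ChapterOutline).FullName;
        }

        public static void Main()
        {
            Console.WriteLine(FullNameViaSubNamespace());
            Console.WriteLine(FullNameViaFullName());
        }
    }
}
```

Both methods refer to the same class, whose full name includes the containing namespaces.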

The Using Statement

Clearly, continually referring to classes with their full names is unwieldy, so C# provides the using statement to allow you to use just the type name itself to identify a class etc. The using statement should come at the start of your source code, and it always names one namespace.

using System.IO;

The using statement indicates that any item in the System.IO namespace will be referred to in your code by the name of the data type only. You can supply as many using statements as you want, and the compiler will then examine the data types in all the specified namespaces in order to attempt to resolve any references to data types that are not defined in the current namespace and do not have a namespace indicated explicitly. For example, System.IO is one of the .NET base class namespaces, and contains a class, FileInfo . If the above statement is placed in your code, then you can write:

FileInfo ReadMeFile = new FileInfo(@"C:\My Documents\ReadMe.txt");

whereas otherwise you would have had to write:

System.IO.FileInfo ReadMeFile =
new System.IO.FileInfo(@"C:\My Documents\ReadMe.txt");

Note that when used in this way, the using statement only applies to the namespace explicitly named - not to any nested namespaces. For example, the statement:

using System;

allows the identifiers in the System namespace to be referred to by the type name only. However, if you want to refer to System.IO.File as File, you will still need to separately supply the statement:

 using System.IO;

You can also use the using statement to provide an abbreviated name for just one type name. This can be useful to distinguish two data types that have the same name but are in different namespaces. For example, suppose you needed to use the System.IO namespace, but you also needed to use a different, third-party-supplied class, FileInfo , that is in another namespace - say ThirdPartyCompany.FileSoftware.FileInfo . Simply adding using statements to access both namespaces would cause confusion because then occurrences of FileInfo could refer to either class. Instead, you could indicate you wish to use just the System.IO namespace then provide a different name for the ThirdPartyCompany.FileSoftware.FileInfo class:

using System.IO;
using TPCFileInfo = ThirdPartyCompany.FileSoftware.FileInfo;

With the above code, FileInfo will now be taken to mean System.IO.FileInfo (and the same for any other classes in System.IO ), while any references to TPCFileInfo will be interpreted by the compiler as indicating ThirdPartyCompany.FileSoftware.FileInfo .
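
To make the alias form concrete, here is a minimal self-contained sketch. Since no real third-party library is available here, the ThirdPartyCompany.FileSoftware.FileInfo class is defined in the same file as a hypothetical stand-in, and AliasDemo is an invented class name. The alias is written in the form using Alias = Namespace.Type;.

```csharp
using System.IO;
using TPCFileInfo = ThirdPartyCompany.FileSoftware.FileInfo;

namespace ThirdPartyCompany.FileSoftware
{
    // Hypothetical third-party class whose name clashes with System.IO.FileInfo.
    public class FileInfo
    {
        public string Describe()
        {
            return "third-party FileInfo";
        }
    }
}

public class AliasDemo
{
    public static string Describe()
    {
        FileInfo readMe = new FileInfo("ReadMe.txt"); // System.IO.FileInfo
        TPCFileInfo other = new TPCFileInfo();        // the aliased class
        return readMe.Name + " / " + other.Describe();
    }

    public static void Main()
    {
        System.Console.WriteLine(Describe());
    }
}
```

Creating a System.IO.FileInfo does not require the file to exist, so the sketch does not touch the disk.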

Data types and Variables

Besides data types that you define yourself as classes, structs, enumerations, or delegates, C# recognizes the following data types, and reserves a keyword to represent each of these types. Note that all data types in the .NET environment are derived from the class System.Object, and are therefore formally objects. Hence the C# keywords in this table are, strictly speaking, simply syntactical conveniences provided by the C# language to make it easier for you to use the relevant base class.

Keyword    Value or reference type    Base class represented    Description
bool       value                      System.Boolean            value of true or false
byte       value                      System.Byte               8-bit unsigned integer
sbyte      value                      System.SByte              8-bit signed integer
short      value                      System.Int16              16-bit signed integer
ushort     value                      System.UInt16             16-bit unsigned integer
int        value                      System.Int32              32-bit signed integer
uint       value                      System.UInt32             32-bit unsigned integer
long       value                      System.Int64              64-bit signed integer
ulong      value                      System.UInt64             64-bit unsigned integer
decimal    value                      System.Decimal            floating point value stored to approx 28 significant figures; has greater accuracy but smaller range than float and double
float      value                      System.Single             32-bit floating point value
double     value                      System.Double             64-bit floating point value
char       value                      System.Char               16-bit value intended to represent a Unicode character
string     reference                  System.String             text (set of Unicode characters)
object     reference                  System.Object             generic type that can refer to any variable of any type

In addition, C# makes available four other keywords that allow you to define other objects that are derived from base class objects.

Keyword     Value or reference    Base class inherited from                       Description
class       reference             System.Object (unless another class specified)  Any reference type
struct      value                 System.ValueType                                Any value type
delegate    reference             System.Delegate or System.MulticastDelegate     Any type that holds details of a method reference
enum        value                 System.Enum                                     Any type that holds a set of enumerated values

C# also defines the keyword event , which refers to a particular definition of a delegate.
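
The statement that the keywords in the first table are conveniences for the corresponding base classes can be checked directly with typeof. The class KeywordAliasDemo below is a hypothetical name used only for this sketch.

```csharp
using System;

public class KeywordAliasDemo
{
    public static bool IntIsInt32()
    {
        // The keyword and the base class name denote the same type.
        return typeof(int) == typeof(System.Int32);
    }

    public static bool IntIsValueType()
    {
        return typeof(int).IsValueType;
    }

    public static bool StringIsValueType()
    {
        // string is a reference type, so this is false.
        return typeof(string).IsValueType;
    }

    public static void Main()
    {
        Console.WriteLine(IntIsInt32());
        Console.WriteLine(IntIsValueType());
        Console.WriteLine(StringIsValueType());
    }
}
```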

Value and Reference Types

C# rigorously enforces rules about which data types are value types (stored either in an area of memory known as the stack, or inline if embedded in another type) and which are reference types. All instances of reference types are stored in another area of memory known as the heap, with the address of the object usually being stored on the stack. It is not possible, as it is in some programming languages, to specify that a particular instance of a class is to be stored in a particular way, since how a particular variable is stored is determined entirely by its data type.

The complete list of value and reference types is given in the above tables.

If a data type is a value type, then each variable of that type contains an instance of that type.

int X = 10; //X contains the value 10

On the other hand, if a data type is a reference type, then each variable of that type simply stores a reference to where in memory the instance is stored. We use the base class System.Drawing.Graphics to illustrate this. Graphics has no public constructor, so here the instance is obtained from the CreateGraphics() method of a form or control.

Graphics dc = CreateGraphics(); // called inside a form; dc only contains a reference
Graphics dc2 = dc; // Graphics instance has not been copied.
                   // Only the address to it has been copied

The instance that a reference type refers to is known as the referent .

As noted elsewhere, the referent can be any instance of the class the reference was declared as, or it can be an instance of any class derived from that class.

It is not possible to access the referent directly in C# - it can only be accessed through the reference.

All variables, whether value or reference, obey the usual scope rules (fields are scoped to all methods in the class or class instance, local variables to the closing brace that ends the method or block statement in which they are defined). However, in the case of reference variables, the referent remains on the heap until the garbage collector removes it. The garbage collector will periodically remove any objects from the heap that are not referred to by any reference variables. Hence, you can extend the lifetime of an object as long as you want by ensuring that the program holds a reference to it somewhere.

It is also possible for a reference to refer to the so-called null reference . This is represented by the keyword null , and indicates that the reference does not refer to anything. If you attempt to call any methods against a null reference, this will cause an exception to be thrown at runtime.

dc = null; // dc now holds the null reference
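
The difference between copying a value and copying a reference, and the effect of the null reference, can be sketched as follows. The types PointValue and PointReference and the class CopySemanticsDemo are hypothetical, introduced only so that the example is self-contained.

```csharp
using System;

public struct PointValue { public int X; }      // value type
public class PointReference { public int X; }   // reference type

public class CopySemanticsDemo
{
    public static int CopyStruct()
    {
        PointValue a = new PointValue();
        a.X = 10;
        PointValue b = a;   // the whole instance is copied
        b.X = 99;           // changes the copy only
        return a.X;
    }

    public static int CopyReference()
    {
        PointReference a = new PointReference();
        a.X = 10;
        PointReference b = a;   // only the reference is copied
        b.X = 99;               // changes the single shared referent
        return a.X;
    }

    public static bool CallOnNullThrows()
    {
        PointReference a = null;   // a holds the null reference
        try
        {
            a.ToString();          // method call against a null reference
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CopyStruct());       // the original struct is unchanged
        Console.WriteLine(CopyReference());    // the change is visible through a
        Console.WriteLine(CallOnNullThrows());
    }
}
```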

Declaring Variables

Variables are declared using the following syntax:

int X;
MyClass Mine;
double D1, D2;

The above code does not, however, initialize the values of the variables. C# has two different approaches to initialization according to the type of data:

  • Variables declared as class or struct member fields are automatically initialized by being zeroed out. This means that numeric types will contain 0, bool s will contain false , and references will contain the null reference. You can modify this behavior by specifying an initial value in the definition of the containing class or struct, or by writing a value in the constructor to this class or struct.
  • Variables declared locally to a method are completely uninitialized, and it is a compile-time error to access any of their values before they are explicitly set.

struct SomeOtherStruct
{
    int X; // will contain 0 initially
    MyClass Mine; // will be a null reference initially

    void DoSomething()
    {
        int Y; // uninitialized
        int Z = Y + 2; // ERROR - Y referenced before being set
    }
}
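
The zeroing-out of member fields can be observed with a short sketch. DefaultsDemo and its field names are assumptions made for this illustration; a compiler may warn that the fields are never assigned, which is exactly the situation being demonstrated.

```csharp
using System;

public class DefaultsDemo
{
    private int count;    // numeric field: zeroed out
    private bool flag;    // bool field: false
    private string text;  // reference field: null reference

    public static string Describe()
    {
        DefaultsDemo d = new DefaultsDemo();
        // None of the fields has been assigned, so each holds its zeroed value.
        return d.count + " " + d.flag + " " + (d.text == null);
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```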

Variables of any data type can be initialized by calling a constructor. For classes and structs that you have defined, the syntax for doing so uses the new keyword followed by the type name and any parameters in parentheses.

MyClass Object = new MyClass();

For predefined data types, this syntax is available to some extent:

int X = new int(); // X will contain zero
string TwentyDs = new string('D', 20); // creates the string
// "DDDDDDDDDDDDDDDDDDDD";

However in most cases you will use an alternative simpler syntax for predefined types, in which the initial value is specified directly.

int Y = 50;
string Message = "Hello World";

This simpler syntax is not available for structs and classes that are defined in your code.

You can initialize the variables when they are declared, or use an assignment statement to specify an initial value later.

int Y;
Y = 50;
MyClass MyObject;
MyObject = new MyClass();

C# recognizes both syntaxes for arrays. Arrays are covered later in this appendix.

For structs only (not classes) you can alternatively initialize a struct by separately initializing all its fields (assuming you have public access to them).

For example, if this struct is defined:

struct ComplexNumber
{
public double X;
public double Y;
}

Then after executing the following code, WaveVector will be considered to be initialized.

ComplexNumber WaveVector;
WaveVector.X = 20.0;
WaveVector.Y = 34.5;

Note that calling the constructor for a reference type involves allocating memory in the heap, while calling a constructor on any value type will simply cause the appropriate values to be placed in the variable - this memory will already have been allocated.
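
The field-by-field initialization of a struct described above can be sketched in compilable form. ComplexNumber is the struct from the text; the class StructInitDemo and its method are hypothetical.

```csharp
using System;

public struct ComplexNumber
{
    public double X;
    public double Y;
}

public class StructInitDemo
{
    public static double SumOfParts()
    {
        ComplexNumber waveVector;   // no constructor call
        waveVector.X = 20.0;
        waveVector.Y = 34.5;

        // Every field has now been set, so the compiler treats waveVector
        // as initialized and allows it to be copied and read.
        ComplexNumber copy = waveVector;
        return copy.X + copy.Y;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfParts());
    }
}
```

If either of the two field assignments were removed, the line that copies waveVector would no longer compile.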

Assigning Values to Variables

The syntax for this is very simple, as it follows the shorthand syntax for constructors of predefined types.

X = -10; // X declared as an int
D1 = 4.0; // D1 declared as a double
Message = "Hello"; // Message declared as a string

For user-defined classes and structs, you must either assign values to the individual fields and properties, or use a constructor.

MyObject = new MyClass();

Alternatively, for both predefined data types and for classes etc. you can syntactically set a variable equal to the return value of a method that returns an appropriate type:

X = GetAnInt(); // GetAnInt defined elsewhere
Mine = GetAMyClassRef(); // Mine of type MyClass, GetAMyClassRef()
// defined elsewhere

Syntax for Different Types of Data

If assigning constant numbers to variables, you may use the following suffixes to indicate that the number should be interpreted as a particular data type: U : uint, L : long, UL : unsigned long, F : float, D : double, M : decimal. These suffixes are all case-insensitive, except that a compiler warning will be generated if you use lower-case l for long, because of the risk of confusing it with the numeral 1 (one).

float GravityAcceleration = 9.81f;
decimal Overdraft = 140.0M;

C# defines a detailed set of rules for precisely when these suffixes are required, but in general you should use one if there is any doubt.
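
The effect of the suffixes can be seen by asking each literal for its runtime type. This is only a sketch; LiteralSuffixDemo is a hypothetical class name.

```csharp
using System;

public class LiteralSuffixDemo
{
    public static string TypeNames()
    {
        object a = 10;      // no suffix: int
        object b = 10L;     // L: long
        object c = 10UL;    // UL: unsigned long
        object d = 9.81F;   // F: float
        object e = 9.81;    // no suffix, with a decimal point: double
        object f = 140.0M;  // M: decimal

        return a.GetType().Name + " " + b.GetType().Name + " " +
               c.GetType().Name + " " + d.GetType().Name + " " +
               e.GetType().Name + " " + f.GetType().Name;
    }

    public static void Main()
    {
        Console.WriteLine(TypeNames());
    }
}
```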

String expressions are always indicated by enclosing the string in double quotes. Single quotes should not be used for strings, as they indicate individual characters:

string Message = "Hello";
char C = 'H';

Certain special characters are however represented by escape sequences:

Escape sequence    Character name     Unicode encoding
\'                 Single quote       0x0027
\"                 Double quote       0x0022
\\                 Backslash          0x005C
\0                 Null               0x0000
\a                 Alert              0x0007
\b                 Backspace          0x0008
\f                 Form feed          0x000C
\n                 New line           0x000A
\r                 Carriage return    0x000D
\t                 Horizontal tab     0x0009
\v                 Vertical tab       0x000B

For example, to represent the string C:\My Documents\Chapter 6.doc , you would write:

string Path = "C:\\My Documents\\Chapter 6.doc";

While to represent this text:

First line

Second line

You could write

string Text = "First line\nSecond line";

If you prefer not to use escape sequences in a particular string, you can instruct the compiler to interpret the complete string literally by prefixing it with an @ sign:

string Path = @"C:\My Documents\Chapter 6.doc";

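Escape sequences such as \n are not processed inside a literal (@) string, so the two-line text above cannot be written on one line in this form. A literal string may, however, extend over several source lines, in which case the line breaks become part of the string.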
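
The relationship between escaped strings and @-prefixed strings can be checked with a few comparisons. StringLiteralDemo is a hypothetical class written for this sketch.

```csharp
using System;

public class StringLiteralDemo
{
    public static bool PathsMatch()
    {
        string escaped = "C:\\My Documents\\Chapter 6.doc";
        string literal = @"C:\My Documents\Chapter 6.doc";
        return escaped == literal;   // both forms produce the same text
    }

    public static int LiteralLength()
    {
        // No escape processing: four characters 'a', '\', 'n', 'b'.
        string s = @"a\nb";
        return s.Length;
    }

    public static string Quoted()
    {
        // Inside an @ string, a double quote is written by doubling it.
        return @"say ""hi""";
    }

    public static void Main()
    {
        Console.WriteLine(PathsMatch());
        Console.WriteLine(LiteralLength());
        Console.WriteLine(Quoted());
    }
}
```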

As remarked above, characters are represented in the same way as strings, but single quotes instead of double quotes are used.

char Ecks = 'X', BackSlash = '\\';

The @ format is valid only for strings. Individual characters must be escaped.

Alternatively, for characters, you can supply the numerical value of the character. Hence, the following is exactly equivalent to the above code.

char Ecks = (char)88, BackSlash = (char)92;

Strictly, what is happening here is that an integer value is being converted (cast) to a char ; this will be explained very shortly.
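
A quick way to convince yourself that the quoted, escaped, and numeric forms denote the same characters is to compare them. CharDemo and its method names are invented for this sketch.

```csharp
using System;

public class CharDemo
{
    public static bool SameCharacters()
    {
        // Quoted form, numeric cast and Unicode escape all give the same char.
        return 'X' == (char)88 && '\\' == (char)92 && '\u0058' == 'X';
    }

    public static int CodeOf(char c)
    {
        return (int)c;   // numeric value of the character
    }

    public static void Main()
    {
        Console.WriteLine(SameCharacters());
        Console.WriteLine(CodeOf('X'));
    }
}
```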

Constants

A variable can be declared as const , which effectively means that it is a constant named value, and cannot be assigned to. If a variable is marked as const , then its initial value must be specified when it is declared.

const int MaxLength = 10;

Note that, unlike in some other languages, constants in C# are generally not written in uppercase. Microsoft decided to recommend Pascal casing (in which only the first letter of each word making up the name is uppercase) because Pascal-cased names are easier to read than uppercase names.

There are additional requirements for const variables that are declared as member fields. These are detailed in the 'Fields' section of this appendix.

Casts

It is possible to convert values from one data type to another using casts.

int X = 10;
long XX = X; // implicit cast from int to long
int XXX = (int)XX; // explicit cast from long to int

C# formally defines two types of cast, both of which are illustrated in the above code. An implicit cast simply involves syntactically assigning the source variable to the destination variable, whereas an explicit cast involves prefixing the expression to be cast with the name of the destination data type, in parentheses. The idea is that an implicit cast is used when nothing can go wrong - there is no risk of data loss or of an exception being thrown. The conversion from int to long is therefore defined as being implicit, because any value that can be stored in an int can also be stored in a long . On the other hand, an explicit cast is used if there is a possibility of the cast failing. For example, conversion from long to int must be explicit because of the risk of an overflow if the long holds a value that is too large to be stored in an int . For the particular case of long -to- int conversion, as well as most explicit conversions between other numeric predefined data types, the results that occur if a number is out of range depend on whether the conversion occurs inside a checked block (see below). By specifying a cast explicitly in your code, you are in effect telling the compiler that you understand there is a risk of a problem arising from the conversion, and you want to do it anyway.

If a cast is defined as implicit, then it is syntactically correct to perform the cast either implicitly or explicitly. On the other hand, if a cast is defined as explicit, an attempt to perform it implicitly will raise a compilation error.

int FirstInteger = 10;
long FirstLong = (long)FirstInteger; // OK, although int-to-long is an
// implicit cast
int SecondInteger = FirstLong; // WRONG - explicit cast required
// long-to-int

For the predefined data types, the C# language specification provides a full list of which casts are explicit and which are implicit. There are too many possible combinations of source and destination data type to list all the casts here, but the situation can be summarized by saying that the only implicit casts allowed are those for which nothing can go wrong - for example, short to int , ushort to long etc. If converting from any floating-point type to any integer type then you need an explicit cast (because of the risk of the fractional part of the number getting lost). Similarly, you need an explicit cast if converting from signed to unsigned (in case the value is negative), or if converting to any data type whose maximum value is lower than that of the source data type. Note, however, that implicit casting from int to float and from long to float or double is permitted, although there may be a loss of precision associated with these casts.

With these provisos, you can convert from any predefined numeric type to any other predefined numeric type. Note, however, that char is really viewed in this context as holding a character rather than a number. Hence conversions from the numeric types to char must always be explicit, even if there is no risk of data loss. (Conversions in the other direction, from char to ushort , int and the wider numeric types, are implicit.)

If an implicit cast is allowed, you may still choose to cast explicitly if it makes it clearer for other developers to follow the logic of your code. This is particularly true when dealing with method overloads and user-defined casts (covered later in the appendix).
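
As a rough illustration of the rules above, the following sketch shows an implicit widening cast, an explicit cast that discards a fractional part, and an implicit int -to- float cast that loses precision. The class and method names ( CastDemo , Truncate , ToFloat ) are invented for this example and are not part of any base class.

```csharp
using System;

public static class CastDemo
{
    // Explicit cast from double to int: the fractional part is discarded.
    public static int Truncate(double value)
    {
        return (int)value;
    }

    // Implicit cast from int to float: compiles without a cast operator,
    // but a float cannot represent every int exactly.
    public static float ToFloat(int value)
    {
        return value;
    }

    public static void Main()
    {
        short small = 100;
        int wider = small;                      // implicit: every short fits in an int
        Console.WriteLine(wider);
        Console.WriteLine(Truncate(3.9));       // displays 3

        int precise = 16777217;                 // 2^24 + 1, not exactly representable as a float
        int roundTripped = (int)ToFloat(precise);
        Console.WriteLine(roundTripped == precise); // precision was lost on the way
    }
}
```

The last comparison is the case to watch: the cast to float compiles silently, which is why an explicit cast can be worth writing even where the language does not demand one.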

Converting To or From string

It is quite common to need to convert between a numeric data type and a string. Most frequently, this is either so that you can display a numeric value or so that you can read text entered by the user into a numeric variable.

C# does not provide any predefined casts from the numeric types to string . However, all data types implement a virtual method, ToString() , which returns a string representation of the data type. This method is inherited by all classes from System.Object , and is overridden in the predefined numeric data types to return a string representation of the value (as opposed to the data type). Hence you can write:

int X = 10;
string ResultText = X.ToString();

For performing the reverse operation, all the predefined numeric data types implement a static method, Parse() , which has a number of overloads, one of which takes a string as a parameter.

string ResultText = "10";
int X = int.Parse(ResultText);

Parse() also has a number of other overloads for most predefined data types that allow you to specify, for example, globalization information concerning the format of the string to be converted.
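
The following is a minimal sketch of one of those overloads, together with one way of coping with text that is not a valid number. ParseDemo and ParseOrDefault are hypothetical names chosen for this example; CultureInfo.InvariantCulture is the .NET base class property for culture-independent formatting.

```csharp
using System;
using System.Globalization;

public static class ParseDemo
{
    // Returns the parsed value, or the supplied fallback if the text
    // is not a correctly formatted integer.
    public static int ParseOrDefault(string text, int fallback)
    {
        try
        {
            return int.Parse(text, CultureInfo.InvariantCulture);
        }
        catch (FormatException)
        {
            return fallback;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ParseOrDefault("10", -1));    // displays 10
        Console.WriteLine(ParseOrDefault("ten", -1));   // displays -1

        // Specifying the culture means "." is always read as the decimal point,
        // whatever the regional settings of the machine.
        double price = double.Parse("1234.5", CultureInfo.InvariantCulture);
        Console.WriteLine(price.ToString(CultureInfo.InvariantCulture));
    }
}
```

Only FormatException is caught here. Parse() can also throw for a null string or for a value that is out of range, and a real application would need to decide how to treat those cases.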

Checked Code

Usually, if an overflow occurs as the result of an arithmetic operation or an explicit cast, no action is taken: The excess bits of the result are lost and execution continues with a variable having a (presumably) incorrect value:

byte Byte1 = (byte)128;
byte Byte2 = (byte)130;
byte Byte3 = (byte)(Byte1 + Byte2); // result is 2

In this case, the result of the addition is 258, but since byte can only store values between 0 and 255, an overflow occurs and we end up with 2 instead.

If you prefer, you can mark a block of code as checked . Any arithmetic operations or explicit casts involving the predefined data types that occur inside a checked block are examined to see if an overflow occurred, and if it did, an OverflowException is thrown.

checked
{
byte Byte1 = (byte)128;
byte Byte2 = (byte)130;
byte Byte3 = (byte)(Byte1 + Byte2); // this will throw an exception
}

Note that checked blocks are scoped at a source code level. In other words, the checked block only applies to the text that is physically inside the block. If you call a method from within a checked block, then the method executes as unchecked unless the code in it is explicitly marked as checked too.

checked
{
// anything here will execute checked
DoSomething();
DoSomethingElse();
}

void DoSomething()
{
   // code here will execute unchecked even though it is
   // called from inside a checked block
}

void DoSomethingElse()
{
   checked
   {
      // code here will execute checked
   }
}
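
To make the difference concrete, here is a small sketch that performs the same byte addition with and without overflow checking. It uses the expression forms checked(...) and unchecked(...) , which are not covered in the text above but apply the same rule to a single expression rather than a block. CheckedDemo , TryAdd and AddWrapped are names invented for this example.

```csharp
using System;

public static class CheckedDemo
{
    // Adds two bytes with overflow checking switched on.
    // Returns false (and sets sum to 0) if the result does not fit in a byte.
    public static bool TryAdd(byte a, byte b, out byte sum)
    {
        try
        {
            sum = checked((byte)(a + b));
            return true;
        }
        catch (OverflowException)
        {
            sum = 0;
            return false;
        }
    }

    // The same addition with checking switched off: excess bits are discarded.
    public static byte AddWrapped(byte a, byte b)
    {
        return unchecked((byte)(a + b));
    }

    public static void Main()
    {
        byte result;
        Console.WriteLine(TryAdd(128, 130, out result));   // overflow detected
        Console.WriteLine(TryAdd(100, 27, out result));    // fits in a byte
        Console.WriteLine(AddWrapped(128, 130));           // displays 2
    }
}
```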

Boxing

Boxing is the technical term for the process of casting any value type to object (in other words, a System.Object instance). The cast is always valid because every type, even a value type, is derived from Object . However, it's important to be aware that the cast involves taking a copy of the value type, since value types are stored inline or on the stack, whereas Object is a reference type and so must be stored on the heap. The process of taking this copy will not be that significant if we are dealing with a primitive data type, but may give a significant performance hit if the value type is a large struct. All the normal rules of casts still apply for boxing processes.

When carrying out the boxing cast, you get an object reference, but this reference points to an instance on the heap that retains the information about its original data type, and so can be cast back. Syntactically, there's no difference between this cast and any other cast from base type to derived type - it's just that the cast involves copying data back from the heap. (Casting between base and derived types is covered in the section on inheritance.)

int X = 20;
object Value;
Value = X; // boxing - implicit cast
int Y = (int) Value; // unboxing - explicit cast

Boxing occurs implicitly when value types are passed as parameters to any method that expects a reference type. In this situation you may prefer to indicate the boxing explicitly in order to make the code more self-documenting. You may also prefer to box explicitly for performance reasons. For example, in the following code, the Console.WriteLine() overload that is called actually takes an object for each of the parameters that follow the format string. Boxing the value RunNumber explicitly means it is only boxed once, instead of being boxed separately every time Console.WriteLine() is called.

int RunNumber = GetRunNumber(); // assume this returns an int
object oRunNumber = (object)RunNumber;
for (int i=0 ; i< 10 ; i++)
{
// assume Result is an array of some numeric type
Console.WriteLine("Run {0} Iteration {1} Gives {2}", oRunNumber, i,
Result[i]);
}

Note that in the above code there is no performance advantage to be gained by explicitly boxing i or Result[i] since each of these is only passed to Console.WriteLine() once before being modified.
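
The fact that boxing takes a copy can be seen in a short sketch such as the one below. BoxingDemo and its methods are hypothetical names used only for illustration. The second method relies on a point not spelled out above: unboxing has to name the exact type that was boxed, so a boxed int cannot be unboxed directly to long .

```csharp
using System;

public static class BoxingDemo
{
    // Boxes a value, changes the original variable, and returns the unboxed copy.
    public static int BoxedCopyIsIndependent()
    {
        int original = 20;
        object boxed = original;   // boxing: the value is copied onto the heap
        original = 99;             // changing the variable leaves the boxed copy alone
        return (int)boxed;         // unboxing: still 20
    }

    // Returns true if unboxing a boxed int to long throws InvalidCastException.
    public static bool UnboxToWrongTypeFails()
    {
        object boxed = 20;
        try
        {
            long value = (long)boxed;
            return value != 20;    // not reached
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(BoxedCopyIsIndependent());   // displays 20
        Console.WriteLine(UnboxToWrongTypeFails());    // displays True
    }
}
```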

typeof and GetType()

C# provides the typeof keyword, which allows you to obtain information concerning different data types. It cannot be used to obtain information about a particular variable, as it takes as its parameter the name of a data type rather than the name of a variable. It returns an instance of the base class System.Type which holds information about a given type.

Type IntTypeInfo = typeof (int);
string TypeName = IntTypeInfo.Name; // TypeName will contain "Int32"

In this example, we have used the Name property of the Type class to extract the name of the type. This by itself may not seem too useful, but the Type class implements a number of other properties and methods that supply extended information about a type. For example, you can find out what methods a type implements:

Type ti = typeof(int);
foreach(System.Reflection.MethodInfo mi in ti.GetMethods())
{
Console.WriteLine(mi.Name);
}

We won't go into full details of the Type and related classes here - details are in MSDN. Type itself is in the System namespace, and most of the related classes are in System.Reflection .

If you wish to obtain information about a type, given a particular variable whose type you are not certain of, you can use the GetType() method, which is implemented by all objects (it is inherited from System.Object ). This method returns a System.Type object that describes the variable (or in the case of a reference variable, the object to which the variable refers, which may be a derived type of the type that the variable was declared as).

object SomeObject;
// initialize SomeObject.
// We now don't know what type it is actually pointing to
Type TypeInfo = SomeObject.GetType();
Console.WriteLine(TypeInfo.Name);
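
The difference between the declared type and the type reported by GetType() is easiest to see with a small inheritance example. In this sketch, Animal , Dog and TypeDemo are hypothetical classes invented for the purpose.

```csharp
using System;

public class Animal { }
public class Dog : Animal { }

public static class TypeDemo
{
    // Returns the name of the run-time type of the object passed in.
    public static string RuntimeTypeName(object instance)
    {
        return instance.GetType().Name;
    }

    public static void Main()
    {
        Animal pet = new Dog();                     // declared as Animal, refers to a Dog
        Console.WriteLine(typeof(Animal).Name);     // displays Animal
        Console.WriteLine(RuntimeTypeName(pet));    // displays Dog
        Console.WriteLine(pet.GetType() == typeof(Dog)); // displays True
    }
}
```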

as and is

C# provides two more operators which assist with casting when you are unsure if the source instance is of the correct data type:

  • as allows you to attempt the cast. If the cast fails because no cast exists between the source and destination data types, it returns the null reference instead of throwing an exception.
  • is allows you to test whether an object is of a particular data type (or of a type derived from that type).

The following code illustrates both operators

int X = 5;
object obj = X;
Console.WriteLine(obj is int); // displays True
Console.WriteLine(obj is short); // displays False
Console.WriteLine(X is object); // displays True
MessageBox Y = obj as MessageBox; // will fail since obj isn't a MessageBox
Console.WriteLine(Y==null); // displays True
Console.ReadLine();

In this code we use the base class, System.Windows.Forms.MessageBox , as an example. The is operator takes two parameters: the name of a variable to be tested, and the name of a class or struct to test it against. It returns the bool value true if the variable is of the given type or of a type derived from it (which means it will definitely be possible to cast it to that type), and false otherwise. Hence, in the above code (obj is int) returns true because obj was created by boxing an int . X is object also returns true because System.Int32 is derived from System.Object , as are all data types. (In fact <anything> is object will return true for any value other than the null reference, since all types are ultimately derived from System.Object ).

Next in the code we attempt to cast obj to a MessageBox reference. This cast will fail because obj actually refers to an instance of int , not MessageBox . If we'd written MessageBox Y = (MessageBox) obj ; then the code would have compiled but thrown an exception at runtime. By using as we simply return the null reference instead.

Note that, whereas you can use the is operator to test against any data type, it is only possible to use the as operator to attempt to cast to reference types. It is a syntax error to use as to cast to value types. The result has to be a reference because otherwise it would not be possible to use the null reference as a failure indicator.
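
A typical use of as is to attempt a cast and then test the result for null , as in the following sketch. Shape , Circle and AsIsDemo are hypothetical classes written for this example only.

```csharp
using System;

public class Shape { }

public class Circle : Shape
{
    public double Radius = 1.0;
}

public static class AsIsDemo
{
    // Returns the radius if the object is a Circle, or -1 if it is anything else.
    public static double RadiusOrMinusOne(object candidate)
    {
        Circle circle = candidate as Circle;   // null if candidate is not a Circle
        if (circle == null)
            return -1.0;
        return circle.Radius;
    }

    public static void Main()
    {
        object something = new Circle();
        Console.WriteLine(something is Shape);                 // displays True
        Console.WriteLine(RadiusOrMinusOne(something));        // displays 1
        Console.WriteLine(RadiusOrMinusOne("not a circle"));   // displays -1
    }
}
```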

Execution Flow

C# uses the following statements or sets of statements to control the flow of execution.

  • if ... else if ... else
  • while
  • do ... while
  • for
  • foreach
  • goto
  • switch ... case ... goto ... default
  • break
  • continue
  • return
  • try ... catch ... finally

We'll examine each of these statements individually in this section, with the exception of try ... catch ... finally , which we'll cover in the section on exceptions.

First a couple of general points:

Most of these statements involve an expression that appears in round brackets being evaluated one or more times in order to determine where the flow of execution should go. The evaluation of this expression may or may not have some side effects. For example:

while ( (X = GetAnInt() )> 3)
DoSomething();

On each iteration round the loop, GetAnInt() will be called - this may have some side effects of its own, as well as the fact that evaluating the expression in the brackets will cause a value to be assigned to X .

The syntax and function of these statements are as follows:

if

The if statement in its most basic form looks like this.

if (Temperature < 10)
Heater.SwitchOn();

The statement inside the if block may be a simple statement as above, or a block statement. The same applies for all the execution-flow commands:

if (Temperature > 20)
{
Fan.SwitchOn();
Console.WriteLine("Fan switched on");
}

The if statement causes the evaluation of the expression that appears in the parentheses after the if keyword. This expression must evaluate to a bool - it is a syntax error for it to evaluate to anything else. If this expression returns true then the statement or block statement inside the if block - that is, the statement immediately following the closing round bracket - is executed. Otherwise, this statement is not executed. Either way, control subsequently transfers to the next statement following the if . Notice that there is no semicolon between the closing round brackets and the statement inside the block.

Any number of else if clauses may optionally be appended to an if statement. If the if() expression returns false , then each expression associated with an else if clause is evaluated in turn until one of the expressions evaluates to true . At this point, the statement immediately following that expression is executed. After this none of the remaining test expressions are evaluated, but instead control transfers to the statements following the if statement.

if (Temperature < 10)
Heater.SwitchOn();
else if (Temperature > 20)
{
Fan.SwitchOn();
Console.WriteLine("Fan switched on");
}

An if statement may also be optionally terminated with an else clause. The statement inside the else clause will be executed only if none of the conditions associated with the if or else if clauses evaluates to true .

if (Temperature < 10)
Heater.SwitchOn();
else if (Temperature > 20)
{
Fan.SwitchOn();
Console.WriteLine("Fan switched on");
}
else if (Temperature < 15)
{
Heater.Initialize();
}
else
Console.WriteLine("Nothing to do. Temperature between 15 and 20");

while

Like the if statement, the while statement consists of the keyword followed by an expression in parentheses. Immediately following the closing round brackets is a statement or block statement.

The following uses while to read a file. The StreamReader class is, like the Console class, one of the .NET base classes.

StreamReader sr = new StreamReader(@"C:\ReadMe.txt");
string NextLineOfFile;
while ( (NextLineOfFile = sr.ReadLine()) != null)
Console.WriteLine(NextLineOfFile);
sr.Close();

When execution reaches a while statement, the test expression is evaluated. If it evaluates to true then the statement inside the while is executed and the test expression is evaluated again. The cycle repeats until the test expression evaluates to false , at which point execution continues from the statement following the while statement.

do

The syntax for do looks like this:

StreamReader sr = new StreamReader(@"C:\ReadMe.txt");
string NextLineOfFile;
do
{
NextLineOfFile = sr.ReadLine();
if (NextLineOfFile != null)
Console.WriteLine(NextLineOfFile);

}
while ( NextLineOfFile != null);
sr.Close();

The procedure when executing a do statement is almost identical to that for a while statement. The only difference is that with a do statement the contained loop code is executed first, and then the condition is evaluated. This means that the code inside a do loop is always executed at least once, while that inside a while loop might not be executed at all. The above code achieves exactly the same effect as the previous example, for the while loop, again using the .NET base class, StreamReader .

for

The for loop is the most complex of all loops in C#. Its syntax looks like this.

for (int I=0 ; I<Customers.Length ; I++)
Console.WriteLine(Customers[I].Name);

We see that, unlike the other loops, the for loop has three expressions/statements inside the round brackets that define the loop, separated by semicolons. Of these, the second item must be an expression that evaluates to a bool . The first and the third items are complete statements. The rule is this:

When the execution flow hits a for loop, the first statement in the parentheses is immediately executed. If, as is typically the case, this statement is simply a declaration and initialization of a variable, then that variable is scoped to the statement in the for loop. Now the loop actually begins. Each iteration of the loop consists of the following:

  1. The expression that forms the second item inside the parentheses is evaluated. If it evaluates to false then the loop terminates instantly, and execution is immediately transferred to the statement following the for loop. Otherwise, the loop continues with the next step.
  2. The statement or block statement inside the for loop is executed.
  3. The statement that forms the third item in the parentheses that define the for loop is executed.

The usual use of the for loop is to provide a counter: The loop is executed as a variable is incremented or otherwise modified, and the loop terminates when this variable hits a certain value or exceeds a range. The above sample will display the strings given in a property of some array elements, Customers[I].Name , for every element of the Customers array, that is, for indices from 0 to Customers.Length - 1.
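
The three steps listed above can be written out by hand as a while loop, which may help to show exactly when each item in the parentheses runs. This is only a sketch ( ForDemo and its methods are invented names), and it ignores two details: the scoping of the counter variable, and the behavior of continue , which in a real for loop still executes the third item.

```csharp
using System;

public static class ForDemo
{
    // Sums the array with a for loop.
    public static int SumWithFor(int[] values)
    {
        int total = 0;
        for (int i = 0; i < values.Length; i++)
            total += values[i];
        return total;
    }

    // The same loop written out by hand, following the three steps.
    public static int SumWithWhile(int[] values)
    {
        int total = 0;
        int i = 0;                    // first item: executed once, before the loop
        while (i < values.Length)     // second item: tested before every iteration
        {
            total += values[i];       // the statement inside the loop
            i++;                      // third item: executed after the statement
        }
        return total;
    }

    public static void Main()
    {
        int[] sample = { 1, 2, 3, 4 };
        Console.WriteLine(SumWithFor(sample));     // displays 10
        Console.WriteLine(SumWithWhile(sample));   // displays 10
    }
}
```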

foreach

The syntax of the foreach loop looks like this.

 foreach(SomeClass Instance in SomeCollection)
DoSomething();

Unlike most execution flow statements, which use a C# expression to control any loop or condition, foreach features a special syntax inside the parentheses, which only occurs in this statement. The special syntax begins with a declaration (but with no initialization) of a variable (the loop control variable). This is followed by the in keyword, followed by the name of an existing variable. The existing variable must be of a type that the compiler recognizes as a collection.

A collection can be thought of, roughly speaking, as a set of objects, and is in this sense similar to an array. However, whereas elements of an array are retrieved via their index, elements of a collection are retrieved by moving through the collection with a special type of object called an enumerator. One other notable feature of collections is that they provide read-only access to the elements retrieved. More precisely, the compiler will recognize any class or struct as a collection if it implements the IEnumerable interface. Of the predefined data types, only arrays satisfy this requirement, but a number of .NET base classes do so, and you are free to implement your own collections as well. (For an example of this, check out the VectorAsCollection example in Chapter 7.) It is a compile-time error to use a foreach loop where the variable following the in keyword is not of a type that implements IEnumerable .

The action taken when the code hits a foreach loop is as follows. On entering the loop, the IEnumerable.GetEnumerator() method is called against the collection variable to return an enumerator, which is expected to implement the IEnumerator interface. For each iteration round the loop, the IEnumerator.MoveNext() method is called against the enumerator to shift to the next item in it. If MoveNext() returns false , there are no more items, so the loop terminates and execution continues at the next statement. Otherwise, the IEnumerator.Current property is read, the result is assigned to the loop control variable, and the statement inside the loop is executed with the loop control variable having its new value, after which the loop iterates again.

Note that the IEnumerator.Current property returns an object reference. This must be cast to the data type of the loop control variable. Obviously, if this cast fails, an exception will be thrown.

If a foreach loop is applied to an array, the effect will be to step through all the values in the array.

goto

The goto statement simply transfers control to a named statement, which is identified by a label. Labels can be recognized because they end with a colon:

if (X == true)
goto NextRound;
// later in code
NextRound:
// code to carry on

There are some restrictions on the relative locations of the goto statement and the labeled statement. In particular, they must both be in the same method, and it is not permitted to jump into a block statement or into a loop. (Though it is possible to jump out of a block statement).

if (X == true)
goto NextRound; // WRONG. Jumping into a block statement.
{
// later in code
NextRound:
// code to carry on
}

Use of goto statements is strongly discouraged, except for the particular case of jumping between case clauses of a switch statement, since it can lead to code that is hard to understand.

switch ... case ... break

The switch statement provides an alternative syntax to if for executing different conditional branches. It is suited to the situation in which there are a number of different actions to be taken depending on the value of a particular variable. The syntax looks like this.

string UserInput = "W";
switch (UserInput)
{
case "X":
case "Q":
RaiseQuitEvent();
return;
case "D":
DisplayImage();
goto case "T";
case "T":
DisplayTitle();
break;
default:
MessageBox.Show("Please enter X, Q, D or T");
break;
}

As the above code shows, the syntax of the switch statement consists of the switch keyword, followed by the name of a variable in parentheses. This is the variable whose value is to be tested, and it may be of any of the predefined integer types, char , string , or an enum type. There then follows a block statement that consists of a number of case clauses. Each case clause gives a possible value of the test variable, followed by a mandatory colon - failure to supply this colon constitutes a syntax error, and in addition the colon allows a case clause to be the destination of a goto statement.

When execution enters a switch statement, the value of the test variable is compared with each of the alternatives presented by the various case clauses until a match is found. Then the instructions following that case clause are executed. If none of the case clauses match and a default clause is present, then the instructions following the default : are executed instead. If no default clause is present and no match is found then execution simply continues at the next statement following the switch statement.

As the above example shows, it is possible for case clauses to immediately follow each other - this is how we provide for the same set of instructions to be executed for more than one value of the test variable.

One syntactical requirement of switch statements is that each conditional sequence of instructions must end with some statement that explicitly indicates where the flow of execution should continue. It is a syntax error for a path of execution to simply hit the following case clause.

      case "D":
DisplayImage(); // WRONG - execution simply
case "T": // hits following case clause

There are three possible such terminating statements:

  • break exits the switch statement and causes execution to continue at the first statement following the switch statement.
  • return is more drastic. It causes the entire containing method to be exited.
  • goto <label> transfers execution to the named label. It is common for this to be used to transfer control to another case label (in fact this is the only way to do this). In the above example, if the user hits D then both the image and the title should be displayed.

Note that it is not possible to use the switch statement to test whether a variable lies within a range, since each case clause must name a single constant value. You will need to use the if statement for that kind of situation.

break

This statement causes early termination of an immediately containing for , foreach , while , do or switch statement. On encountering a break , control immediately transfers to the next statement following the enclosing loop or switch statement. We have already seen this behavior in the switch statement.

The most common uses for break are to terminate case clauses in a switch statement, and to provide an alternative means of exiting a loop where more than one condition can require termination. For example, the following uses the .NET base class, StreamReader , to read a file until either the end of the file is reached or a line containing " Exit " is reached.

StreamReader sr = new StreamReader(@"C:\ReadMe.txt");
string NextLineOfFile;
while ( (NextLineOfFile = sr.ReadLine()) != null)
{
if (NextLineOfFile == "Exit")
break;
// do whatever other processing is needed;
}
sr.Close();

continue

This is similar to break except that instead of exiting the entire loop, we simply exit that iteration early, and immediately start the next iteration of the loop. The continue statement is valid only inside for , foreach , while or do loops. It is not valid inside switch statements.

The following extends the previous example by allowing for a line that contains the text " Ignore " to not be processed (in other words, only this line is skipped - processing continues on the next line of the file).

StreamReader sr = new StreamReader(@"C:\ReadMe.txt");
string NextLineOfFile;
while ( (NextLineOfFile = sr.ReadLine()) != null)
{
   if (NextLineOfFile == "Exit")
      break;
   if (NextLineOfFile == "Ignore")
      continue;
   // do whatever other processing is needed;
}
sr.Close();

return

This statement causes immediate termination of execution of the method that is currently being executed. Control transfers back to the calling method. All variables that are scoped to the method being executed will, of course, go out of scope as control is returned.

The return statement may take one parameter, which must be of the return type of the method currently being exited (or be implicitly castable to that data type). This will be the return value of the method. If the method is defined to return void , then the return statement takes no parameters.

No brackets are needed for the parameter of a return statement.

public bool IDIsValid(string ID) // ID is valid if first character is 'd'
{
   if (ID == null)      // or in this case, might alternatively
      return false;     // throw ArgumentNullException
   if (ID[0] == 'd')
      return true;
   return false;
}

Arrays

An array in C# is actually an instance of the .NET base class, System.Array . However, C# wraps some special syntax around this class in order to make using its methods much more intuitive. Hence, for the most part, we can simply treat arrays as if they are a part of the C# language syntax itself, and not worry about the fact that behind the scenes we are actually dealing with this base class.

Declaring and Initializing an Array

The simplest way to declare an array is like this

decimal [] Salaries;

Here the square brackets after the type name tell the compiler that we are actually declaring not a decimal, but an instance of System.Array , which will hold decimal s. (We've given this example for decimal but the same syntax applies for any other data type, value or reference). What the above code will actually do is declare a reference that can refer to a System.Array instance that holds a one-dimensional array of decimal s. This reference is called Salaries . At this stage, we have not actually instantiated an array - to do this, we need to call the System.Array constructor - which is done using the new keyword, as for most other data types. The only difference is that we need to supply the number of elements in the array, which is done using square brackets.

decimal [] Salaries = new decimal[5];

The dimension (or rank) of the array as well as the data type it holds are formally considered part of the type definition, and so are part of the definition of Salaries . However, the length of the array (the number of elements in it) is considered to be an initial value, not part of the type, and so is specified only on the right hand side of the above statement.

It's important to be aware that the elements of an array are formally considered as fields inside a System.Array class instance, and so will always be initialized by being zeroed out. Also, because Array is itself a reference type, the elements of it will be stored inline on the heap, even if they are value types.

If we wish to specify alternative initial values for the elements we can do so like this.

decimal [] Salaries = new decimal[5] {30000.00M, 35000.00M, 67500.00M,
100000.00M, 50000.00M};

Since the size of the array is obvious from the initialization list, we don't need to explicitly indicate it if we are using the above syntax.

decimal [] Salaries = new decimal[] {30000.00M, 35000.00M, 67500.00M,
100000.00M, 50000.00M};

As with the other simple data types, C# recognizes an alternative syntax that does not use the new keyword.

decimal [] Salaries = {30000.00M, 35000.00M, 67500.00M, 100000.00M,
50000.00M};

In this case, once again, the compiler can work out from the contents of the curly braces how big the array is supposed to be.

Array Size

The size of an array can be specified at runtime rather than compile time. However, it cannot be adjusted once the array has been constructed.

int X;
// do some processing to figure out value of X
decimal [] Salaries = new decimal[X];

The size of a 1-dimensional array is subsequently available as the read-only Length property

int ArraySize = Salaries.Length;

Using Arrays

Once an array has been constructed, its elements are then accessed by specifying the index in square brackets. It is treated just like an ordinary variable of the appropriate type. Arrays are zero-indexed, so that, for example, an array of size 5 has elements numbered from 0 to 4.

Salaries[0] = 40000.0M;
decimal MrBrownsSalary = Salaries[3];
Salaries[4] *= 2;

Bounds checking is automatically performed for arrays. If you attempt to access an element outside the range of the array, an IndexOutOfRangeException will occur at runtime.

// assuming Salaries.Length = 5;
Salaries[5] = 10000.0M; // will cause an exception to be thrown.
//Max. value for index is 4.
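
As a rough sketch of how the bounds check shows up in practice, the following code tests an index against Length before using it, and also demonstrates the exception being thrown. BoundsDemo and its methods are hypothetical names used for illustration.

```csharp
using System;

public static class BoundsDemo
{
    // Returns true if index is a legal position in the array.
    public static bool IsValidIndex(decimal[] values, int index)
    {
        return index >= 0 && index < values.Length;
    }

    // Returns true if reading values[index] throws IndexOutOfRangeException.
    public static bool AccessThrows(decimal[] values, int index)
    {
        try
        {
            decimal ignored = values[index];
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        decimal[] salaries = new decimal[5];
        Console.WriteLine(IsValidIndex(salaries, 4));   // displays True
        Console.WriteLine(IsValidIndex(salaries, 5));   // displays False
        Console.WriteLine(AccessThrows(salaries, 5));   // displays True
    }
}
```

Testing the index first is usually preferable to catching the exception, which is better kept for genuinely unexpected situations.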

Rectangular Multidimensional Arrays

It is also possible to declare rectangular arrays of more than one dimension, by using commas inside the square brackets to indicate the dimensions. The number of dimensions (which is technically known as the rank of the array) will always be one greater than the number of commas in the definition. The following code uses an array of rank 2.

double [,] TransformMatrix = new double[3,3];
for (int I = 0 ; I < 3 ; I++)
{
for (int J=0 ; J<3 ; J++)
TransformMatrix[I,J] = 4;
}

Arrays of this type are known as rectangular arrays because each 'row' of the array has the same number of elements. In the above example, the array has 9 elements - 3 rows x 3 columns.

It is possible to instantiate multidimensional arrays with as large a rank as you want. For example, to declare a rank 5 array of string:

string [ , , , ,] MassiveArray;

Jagged Arrays

In C#, if you wish to define a multidimensional array in which not all 'rows' have the same numbers of elements, it is possible to do so. Such an array is technically known as a jagged array, and is constructed by effectively forming an array of arrays. The syntax for jagged arrays involves specifying each dimension in a separate pair of square brackets. To give a 2-dimensional example:

// construct array
string[][] Text = new string[3][];
Text[0] = new string[2];
Text[1] = new string[4];
Text[2] = new string[2];

// initialize some of the elements
Text[0][0] = "1st word of 1st line";
Text[0][1] = "2nd word of 1st line. The 1st row of this array has just " +
"these two elements";
Text[1][0] = "And the second row has 4 elements";
// etc.

As with rectangular arrays, it is possible to define jagged arrays of higher rank as well - though for higher rank jagged arrays initializing them takes quite a bit of work! The above code shows that each individual 'row' of a jagged array needs to be declared separately, as a separate array in its own right. Jagged arrays are useful for any situation in which groups of data might have different lengths. For example you might represent an array of details of purchases made by customers in a 2-dimensional jagged array:

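// Purchase is a custom defined class. NumberOfCustomers is assumed to be an
// int declared elsewhere: the size of the first dimension must be supplied
Purchase [][] Purchases = new Purchase[NumberOfCustomers][];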

// later and given appropriate variables
Purchase SomePurchase = Purchases[CustomerIndex][PurchaseIndex];

A rectangular array would not be suitable here because each customer will most likely have made a different number of purchases.
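
The purchases idea can be sketched in runnable form as follows. To keep the example self-contained it stores decimal amounts rather than instances of a Purchase class, and the names JaggedDemo and TotalPurchases() are made up for the illustration.

```csharp
using System;

static class JaggedDemo
{
    // Adds up the length of every row; the rows may all differ in length.
    public static int TotalPurchases(decimal[][] purchases)
    {
        int total = 0;
        foreach (decimal[] customerPurchases in purchases)
            total += customerPurchases.Length;
        return total;
    }

    static void Main()
    {
        // Outer array: one entry per customer. Inner arrays: that customer's purchases.
        decimal[][] purchases = new decimal[3][];
        purchases[0] = new decimal[] { 9.99m, 24.50m };
        purchases[1] = new decimal[] { 5.00m };
        purchases[2] = new decimal[] { 1.25m, 2.50m, 3.75m, 100.00m };

        Console.WriteLine(purchases.Length);           // 3 customers
        Console.WriteLine(purchases[1].Length);        // 1 purchase for the second customer
        Console.WriteLine(purchases[2].Length);        // 4 purchases for the third customer
        Console.WriteLine(TotalPurchases(purchases));  // 7
    }
}
```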

Operators

C# recognizes a large number of operators. An operator is basically a symbol which, when encountered, causes some processing to be done that returns a value.

NewBalance = OldBalance + Interest;

This code actually formally contains two operators. The + operator adds the values of the two variables to its immediate left and immediate right and returns their sum. (These two variables can be considered to be the parameters passed to the + operator). Then the = operator assigns the variable on the left the value of the expression or variable on the right - hence storing the sum in NewBalance . Operators always return a value, and = returns the value assigned. This means that we can, if we want, set more than one variable to the sum, by writing:

CopyOfNewBalance = NewBalance = OldBalance + Interest;

Note that this statement only works as described because the compiler arranges for the operators to be evaluated in the appropriate order. What we expect to happen is this:

CopyOfNewBalance = ( NewBalance = ( OldBalance + Interest) );

This is what actually happens because C# has strict rules of operator precedence - which operator in an expression gets evaluated first. According to these rules, + has a higher precedence than = , which means that any occurrences of + in an expression will be evaluated before any occurrences of = . If more than one = occurs, these will be evaluated right-to-left. The rules for operator precedence are detailed in the MSDN documentation, but have been carefully designed so that the result of any expression is what would intuitively be expected. For example, in the expression I + J * K , the multiplication is evaluated first then I added to the result.
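
The precedence and right-to-left rules just described can be checked with a small sketch such as the following. The class and method names are invented for the example, and int values stand in for the balances.

```csharp
using System;

static class PrecedenceDemo
{
    // Performs the chained assignment from the text and returns the copy.
    public static int ChainedAssignment(int oldBalance, int interest)
    {
        int newBalance;
        int copyOfNewBalance;
        // + is evaluated first; the two = operators are then applied right-to-left.
        copyOfNewBalance = newBalance = oldBalance + interest;
        return copyOfNewBalance;
    }

    static void Main()
    {
        Console.WriteLine(ChainedAssignment(100, 5));  // 105
        Console.WriteLine(2 + 3 * 4);                  // 14: multiplication comes first
        Console.WriteLine((2 + 3) * 4);                // 20: parentheses change the order
    }
}
```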

An important point to understand with C# operators is that they always return a value, but they sometimes have some side effect of setting some variable. We've just seen an example of this, in which = returns a value but also has the effect of setting the value of the variable on its left. For the case of = , this side effect is usually the main reason why we've chosen to use the operator.

As another example, assuming X is some numeric type, in the statement:

if (X == 0)
DoSomething();

the == operator returns a value, which is of type bool and will have the value true if both parameters ( X and 0 ) happen to have the same value, and false otherwise. This particular operator does not have any side-effects.

Note the difference between = (assignment operator) which sets a value, and == (comparison operator) which compares values. It is important not to confuse these operators.

An operator can in many ways be regarded as a more intuitive syntax for a method call. For example, our last example can be regarded as a simpler way of writing the following (which for numeric types will have the same effect).

if (X.Equals(0))
DoSomething();

Syntax of Operators

Operators can formally be regarded as taking one, two or three parameters, depending on the operator. An operator that takes two parameters is known as a binary operator , and is written between the two parameters. This is known as infix notation and is the notation we are used to in everyday life - hence the following syntactically correct expressions have their intuitive meanings.

A + B // returns sum of A and B

Y = 0 // assigns 0 to Y

Y == 0 // returns true if Y contains 0

Note that the above snippets do not include terminating semicolons because they represent expressions rather than complete statements.

Usually, if an operator takes one parameter, this parameter is written to the operator's immediate right (prefix notation - the 'pre' refers to the position of the operator, not the operand).

-X // reverses the sign of the expression

However two operators, increment ( ++ ) and decrement ( -- ), may be written with the parameter either to the right (prefix notation) or the left (postfix notation), and they have different meanings according to the position:

++X // adds 1 to the value of X and returns new value of X
X++ // adds 1 to the value of X and returns old value of X
--X // subtracts 1 from the value of X and returns new value of X
X-- // subtracts 1 from the value of X and returns old value of X

Operators that take one parameter are known as unary operators .
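
The difference between the prefix and postfix forms is easiest to see by capturing the value that each one returns. The following is a minimal sketch, with invented class and method names.

```csharp
using System;

static class IncrementDemo
{
    // Prefix form: x is changed first, then its new value is returned.
    public static int PrefixIncrement(int x)
    {
        return ++x;
    }

    // Postfix form: the old value is returned; changing x is only a side effect.
    public static int PostfixIncrement(int x)
    {
        return x++;
    }

    static void Main()
    {
        Console.WriteLine(PrefixIncrement(5));   // 6
        Console.WriteLine(PostfixIncrement(5));  // 5

        int x = 5;
        int y = x--;   // y receives the old value 5; x becomes 4
        int z = --x;   // x becomes 3 first; z receives 3
        Console.WriteLine(x + " " + y + " " + z);  // 3 5 3
    }
}
```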

C# has one operator that takes three parameters, the ternary operator. This operator has two symbols, ? and : , and is written with these symbols placed between the parameters (infix notation, again).

X ? Y : Z

The first parameter of the ternary operator must be a bool (or an expression that returns a bool). The ternary operator evaluates the first parameter, then returns the second parameter if the first parameter returns true , and the third parameter otherwise. For example:

// Result is a numeric variable
string SignOfResult = (Result >= 0) ? "Positive" : "Negative";

Note that in general, the parameters passed to the operators can be either variables or expressions. However, if the operator assigns new values to any of its parameters, then those parameters must be variables.

There are a number of different operators that take different numbers of parameters but are represented by the same symbol. For example, unary - returns the result from reversing the sign of the parameter, while binary - works out the value obtained by subtracting the second parameter from the first:

A-B // binary -
-B // unary -

There is in practice never any confusion - it is always clear how many parameters there are and hence which meaning is intended.

Meanings of the Operators

For most of the operators, it is easiest to understand what they do from observing the context rather than looking at their strict definitions. Most of the operators have numerous overloads, depending on the data types of their operands. For example, in the expression X=A+B ; the action of the + operator depends on the types of A and B . If A and B are int s, it will return their sum as an int . On the other hand, if A and B are double s, it will return their sum as a double . Intuitively this seems like the same process, but strictly speaking adding two int s and adding two double s are different operations that require different assembly language instructions. It is also possible to overload the addition operator for your own classes - and, of course, strictly you can overload it to do whatever you want, but your code will be hard to understand if your overload doesn't correspond to what most people would understand as addition in the context of your class.

In the following tables, we will list all the operators, providing examples of their syntax and an intuitive description of their return value and side effect when they are applied to the predefined data types. Unless otherwise indicated, the operators can be applied to all numeric types (but not string ); only + , += and = can be applied to string . However, the bitwise and shift operators can only be applied to integer types, not to float or double .

Although you can overload operators to have meanings for your own classes and structs, you should note that, in general, the operators have no meaning when applied to classes etc. unless you define overloads. The only exceptions are as follows:

  • Assignment ( =) is defined for all reference types to copy the reference (not the instance of the class) and for all value types to copy the memory that the variable occupies.
  • Comparison ( == and !=) are defined for reference types to compare the reference (but they are overloaded for strings to compare the text of the string). For value types that you define, that is structs, == and != are undefined unless you overload them.
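
To make these default behaviours concrete, here is a small sketch. PointClass and PointStruct are hypothetical types defined only for this example, and neither overloads any operator.

```csharp
using System;

class PointClass { public int X; }
struct PointStruct { public int X; }

static class EqualityDemo
{
    // Assigning a class variable copies the reference, so both names share one object.
    public static int AssignClassThenModify()
    {
        PointClass first = new PointClass { X = 1 };
        PointClass second = first;
        second.X = 2;
        return first.X;
    }

    // Assigning a struct variable copies the data, so the original is unaffected.
    public static int AssignStructThenModify()
    {
        PointStruct first = new PointStruct { X = 1 };
        PointStruct second = first;
        second.X = 2;
        return first.X;
    }

    static void Main()
    {
        Console.WriteLine(AssignClassThenModify());   // 2
        Console.WriteLine(AssignStructThenModify());  // 1

        PointClass a = new PointClass { X = 1 };
        PointClass b = new PointClass { X = 1 };
        Console.WriteLine(a == b);                    // False: two different objects

        string s1 = "hello";
        string s2 = new string(new char[] { 'h', 'e', 'l', 'l', 'o' });
        Console.WriteLine(s1 == s2);                  // True: string overload compares text
        Console.WriteLine((object)s1 == (object)s2);  // False: plain reference comparison
    }
}
```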

Unary operators

These operators all act on one variable. Their return type is the same as the type of variable that they operate on. In the following table, the syntax column uses the symbol X to represent the parameter.

Symbol | Nature of action | Syntax | Return value | Side-effect
- | sign reversal | -X | -X | none
++ | increment | ++X | X+1 | Adds 1 to X
++ | increment | X++ | X (the old value) | Adds 1 to X
-- | decrement | --X | X-1 | Subtracts 1 from X
-- | decrement | X-- | X (the old value) | Subtracts 1 from X
* | dereference | *X | Contents of memory location with address X | none
& | address-of | &X | Address in memory of variable X. (X must be a variable, not an expression) | none
! | logical NOT | !X | true if X is false, false otherwise (X must evaluate to bool) | none
~ | bitwise complement | ~X | Result from reversing each bit in X (X must be an integer type) | none

Binary operators

The binary operators all return the same data type as their operands (which must both be castable to the same data type), except for the comparison operators, which always return bool.

Symbol | Nature of action | Syntax | Return value | Side-effect
= | assignment | X=Y | Y | Sets X to have same value as Y. (X must be a variable, not an expression)
== | comparison | X==Y | true if X and Y are equal, false otherwise | none
!= | comparison | X!=Y | true if X and Y are not equal, false otherwise | none
> | comparison | X>Y | true if X is greater than Y, false otherwise | none
<= | comparison | X<=Y | true if X is less than or equal to Y, false otherwise | none
< | comparison | X<Y | true if X is less than Y, false otherwise | none
>= | comparison | X>=Y | true if X is greater than or equal to Y, false otherwise | none
+ | addition | X+Y | Sum of X and Y | none
- | subtraction | X-Y | Value of X-Y | none
/ | division | X/Y | Value of X divided by Y | none
* | multiplication | X*Y | Value of X times Y | none
% | remainder | X%Y | Remainder on dividing X by Y, where X and Y are integers | none
+= | addition-assignment | X+=Y | Value of X+Y | Assigns result to X (X must be a variable, not an expression)
-= | subtraction-assignment | X-=Y | Value of X-Y | Assigns result to X (X must be a variable, not an expression)
/= | division-assignment | X/=Y | Value of X divided by Y | Assigns result to X (X must be a variable, not an expression)
*= | multiplication-assignment | X*=Y | Value of X times Y | Assigns result to X (X must be a variable, not an expression)
%= | remainder-assignment | X%=Y | Remainder on dividing X by Y, where X and Y are integers | Assigns result to X (X must be a variable, not an expression)
&& | logical AND | X&&Y | true if X and Y are both true, false otherwise. (X and Y must both evaluate to bool) | none
|| | logical OR | X||Y | true if either X or Y or both are true, false otherwise. (X and Y must both evaluate to bool) | none
& | bitwise AND | X&Y | Result of performing a bitwise AND operation on X and Y | none
| | bitwise OR | X|Y | Result of performing a bitwise OR operation on X and Y | none
^ | bitwise XOR | X^Y | Result of performing a bitwise exclusive-OR operation on X and Y | none
&= | bitwise AND-assignment | X&=Y | Result of performing a bitwise AND operation on X and Y | Assigns result to X (X must be a variable, not an expression)
|= | bitwise OR-assignment | X|=Y | Result of performing a bitwise OR operation on X and Y | Assigns result to X (X must be a variable, not an expression)
^= | bitwise XOR-assignment | X^=Y | Result of performing a bitwise exclusive-OR operation on X and Y | Assigns result to X (X must be a variable, not an expression)
>> | right-shift | X>>Y | Result of bit-shifting X right by Y bits | none
<< | left-shift | X<<Y | Result of bit-shifting X left by Y bits | none
>>= | right-shift-assignment | X>>=Y | Result of bit-shifting X right by Y bits | Assigns result to X (X must be a variable, not an expression)
<<= | left-shift-assignment | X<<=Y | Result of bit-shifting X left by Y bits | Assigns result to X (X must be a variable, not an expression)

Note that the symbol <> is not recognised as an operator. To test for inequality you must use !=.
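
The bitwise and shift operators are probably the least familiar entries in the table, so the following sketch shows one common use of them: treating an int as a set of on/off flags. The helper names SetFlag(), HasFlag() and ClearFlag() are invented for the example.

```csharp
using System;

static class BitDemo
{
    // Turns on the given bit: shift a 1 into position, then OR it in.
    public static int SetFlag(int flags, int bit)
    {
        return flags | (1 << bit);
    }

    // Tests the given bit: AND leaves a non-zero value only if the bit was on.
    public static bool HasFlag(int flags, int bit)
    {
        return (flags & (1 << bit)) != 0;
    }

    // Turns off the given bit: AND with the complement of the bit's mask.
    public static int ClearFlag(int flags, int bit)
    {
        return flags & ~(1 << bit);
    }

    static void Main()
    {
        int flags = 0;
        flags = SetFlag(flags, 0);
        flags = SetFlag(flags, 3);
        Console.WriteLine(flags);              // 9 (binary 1001)
        Console.WriteLine(HasFlag(flags, 3));  // True
        Console.WriteLine(HasFlag(flags, 1));  // False
        flags = ClearFlag(flags, 0);
        Console.WriteLine(flags);              // 8 (binary 1000)

        Console.WriteLine(12 >> 2);            // 3
        Console.WriteLine(6 ^ 3);              // 5 (110 XOR 011 = 101)
        Console.WriteLine(7 % 3);              // 1
    }
}
```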

Ternary Operator

Symbol | Nature of action | Syntax | Return value | Side-effect
? : | ternary operator | X ? Y : Z | Y if X is true, Z otherwise (X must evaluate to bool) | none

Classes

Classes are defined using the keyword class, with the following syntax:

class Vector
{
   public double X;
   public double Y;
   public double Norm()
   {
      // the norm (length) is the square root of the sum of the squares
      return System.Math.Sqrt(X*X + Y*Y);
   }
}

If no base class is specified then the base class is assumed to be System.Object . A class is always derived from one other class and may additionally be derived from any number of interfaces.

class BetterVector : Vector, IEnumerable, ICollection, IFormattable
{
// members
}

The interfaces used in the above example are all .NET base class interfaces.

If a class is derived from another class, then all members of the base class are automatically also part of the derived class (although any method that has private access will not be visible to code in the derived class). The derived class may, however, implement its own versions of any members - either overriding or hiding the corresponding base class method.

If a class is derived from an interface, then the class must provide implementations of all members of the interface; it is a syntax error for it not to do so.

Each member of a class should be given one of the access modifiers public , private , protected , internal , or protected internal . The meanings of these are:

Keyword | Meaning
public | The member is visible to all other code outside the class
private | The member is only visible within the class
protected | The member is only visible within the class and any derived classes
internal | The member is only visible within the assembly in which it has been defined
protected internal | The member is visible to any code within the assembly in which it has been defined, and also to derived classes in other assemblies. It is not visible to other code outside the assembly

If no access modifier is indicated explicitly for a member, then that member will be private.

Types of Class Members

Members of a class may include any of the following:

Type of member | Description
Field | A member variable.
Method | A function call - a set of code that performs some task.
Property | A method call or a pair of method calls syntactically dressed to look to callers like a field.
Indexer | A method call or a pair of method calls syntactically dressed to look to callers as if the containing object is an array and we are accessing an element of this array.
Constructor | A method that is automatically called when a class is loaded or a class instance is instantiated in order to initialise the class or class instance.
Destructor | A method that is automatically called when a class instance is garbage collected.
Custom cast | A method that is automatically called as needed to carry out conversions between instances of this class and instances of some other class.
Event | A variable that contains references to methods and which is used in the Windows event handling architecture.
Definition of another data type | Definition of a class, struct, enumeration or delegate.

Static and Instance Class Members

By default, members of a class are instance members. For the case of fields and events, this means that one copy of this field is independently stored with each instance of the class. The data in an instance field is associated with a class instance rather than a class. For all other members (the function members), only one copy of the code is stored, no matter how many class instances are declared. However, the code for a function is able to access instance fields because, when called, it is passed an additional hidden parameter, which is not seen by the developer, which gives a reference to the class instance against which the method has been called. In this way, instance methods are still associated with a class instance. The hidden parameter is accessible via the this keyword.

Members of a class may also be declared as static. If a member is declared as static, then it is associated with the class rather than with any instance. Members are declared as static by marking them with the static keyword.

class CustomerName
{

   public static ushort MaxLength;

For fields, only one copy of a static field is stored, and all instances of the class have access to the same copy of this field. If one class instance modifies the copy of a static field, then the changed value applies to all other instances of the same class (or to any other classes that have access to this field).

If a function is declared as static, then it is not passed any hidden parameter that gives any instance of a class. Hence static functions cannot be associated with a class instance. It is a syntax error for any function that has been declared as static to access any instance data: Static methods may only access static fields.

Any field may be declared as static, with the exception that const fields are always implicitly static, and so should not be explicitly declared as such:

public const ushort MaxLength = 10; // automatically static

Methods, indexers and properties may be declared as static. Constructors and destructors may not be declared as static, except for the special case of the static constructor, which has a precise syntax as described later in the appendix. Casts may only be declared as static.

Definitions of other data types should not be declared as static.
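
The distinction between static and instance members can be illustrated with a short sketch. NameRules is a hypothetical class written for this example (loosely modelled on CustomerName above, but using an int limit), not a definition taken from elsewhere in the appendix.

```csharp
using System;

class NameRules
{
    public static int MaxLength = 10;   // static: one copy shared by the whole class
    public string Name;                 // instance: one copy stored in each object

    // Instance method: may use both instance and static members.
    public bool IsValid()
    {
        return Name != null && Name.Length <= MaxLength;
    }

    // Static method: receives no hidden this reference, so it can only use static members.
    public static bool Fits(string candidate)
    {
        return candidate.Length <= MaxLength;
    }
}

static class StaticDemo
{
    static void Main()
    {
        NameRules first = new NameRules();
        first.Name = "Arabel Jones";           // 12 characters
        NameRules second = new NameRules();
        second.Name = "Ann Lee";               // 7 characters

        Console.WriteLine(first.IsValid());    // False
        Console.WriteLine(second.IsValid());   // True

        NameRules.MaxLength = 20;              // static member accessed through the class name
        Console.WriteLine(first.IsValid());    // True: every instance sees the new limit
        Console.WriteLine(NameRules.Fits("A name that is far too long"));  // False
    }
}
```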

Accessing Members of a Class

Members of the class that are not static are accessed from code outside the class by supplying the name of the class instance followed by the scope resolution operator, the dot.

// CustomerName defined as a class, so an instance must be created before use
CustomerName NextCustomer = new CustomerName();
NextCustomer.Name = "Arabel Jones";

Static members of the class are accessed from code outside the class by supplying the name of the class followed by the scope resolution operator.

   ushort Length = CustomerName.MaxLength;

For code within a class, members of that class may be accessed either by simply giving their name, or by prefixing the name of the member by the this keyword.

// in CustomerName definition
Length = MaxLength;

If, within a method, another local variable is declared that has the same name as a class member, then the local variable hides the class member. In that case, supplying the name by itself will be interpreted as referring to the local variable, and the class member can only be referred to using the this syntax.

class CustomerName
{

public CustomerName(string name)
{
this.name = name;
}

Recall that within an instance method, the this keyword accesses the hidden parameter passed to the method that represents the object against which the method has been called.

If the code within a method needs to access a member of the same instance that has been defined in the immediate base class, then it may do so using the base keyword. This is only necessary if the member has been hidden or overridden in the derived class; this is the usual reason for using base . (The base keyword is the equivalent of Java's super keyword.)

class OurControlWithGreenText : OurCompanysGenericControl
{
public override void DisplayText(string text)
{
TextColor = Color.Green;
base.DisplayText(text);
}
// etc.

(Note that despite the suggestive names, the classes and methods in this example are made up - they are not, with the exception of Color.Green - part of the .NET Windows Forms base classes).

The this Reference

The this keyword is used when you need a reference within a method to the containing class instance. As noted above, the this reference provides an alternative syntax for accessing other members of the class instance.

However, this is also treated as a variable in its own right. For example, you may use it in order to pass a reference to the containing class instance as a parameter to another method. This would commonly be done in order for one class instance to provide another class instance with a reference to itself. (Note, the example below only makes sense for classes, not structs):

   class MyTreeElement
{
MyTreeElement(MyTreeElement parent)
{
// etc.
}

// somewhere in the code
MyTreeElement Child = new MyTreeElement(this);

Inheritance

As mentioned in the section on classes, all classes derive directly from just one other class, except for System.Object which does not derive from anything else. (Note that indirectly a class may derive from many classes, as in the following example: OurNewGroovyZippyHardDrive derives directly from HardDrive , but indirectly (via HardDrive ) from Device , and in turn from System.Object .)

Consider this situation:

class Device
{
// methods etc.
}

class HardDrive : Device
{
// methods etc.
}

class OurNewGroovyZippyHardDrive : HardDrive
{
// methods etc.
}

Inherited classes gain all members of the base class. So, for example, all fields, methods, indexers and other members of Device are automatically members of HardDrive . HardDrive can also define any new members of its own. These, along with the ones defined in Device , are automatically also present in the OurNewGroovyZippyHardDrive class. Note, however, that any member that is declared as private, although present, is not visible to any code written in derived classes.

For the case of methods etc., a derived class takes the implementation of each method from the base class. However, you may choose to supply alternative implementations of any method in any derived class. What the compiler does in this situation is discussed later in this appendix, in the section on method overloads and method hiding.

The structure of which class is derived from which other class is known as the class hierarchy. A derived class is sometimes known as a subclass, and a base class as a superclass. Because a class may have many subclasses but only one superclass, the class hierarchy has the appearance of a tree structure:

Image 1

It is a syntax error to form any circular dependencies in the class hierarchy.

References to Derived Classes

A characteristic of C# references is that a variable that is declared as a reference to some class can alternatively point to any derived class. So, for example, continuing the above example, suppose we write:

HardDrive SomeHardDrive;

As you would expect, we could write

SomeHardDrive = new HardDrive();

However, it is also perfectly legitimate to then code up this:

SomeHardDrive = new OurNewGroovyZippyHardDrive();

On the other hand, we cannot set references equal to base classes of the referent type.

SomeHardDrive = new Device(); // WRONG

Since all classes are derived from System.Object , a reference to Object can be set equal to any instance of any class whatsoever.

Object Anything; // can set this equal to any object

Note that you would generally write the above code as follows, since C# uses the object keyword to represent System.Object :

object Anything; // a better way of saying the same thing

The reason that it is fine for references to refer to derived classes is that a derived class automatically implements all the fields and methods etc. of a base class. In a sense, a derived class is always the base class plus a bit more. So, one way of looking at it, is that in the above examples, SomeHardDrive is roughly speaking pointing to a HardDrive instance - it's just that it's a HardDrive instance that has some stuff added to turn it into an OurNewGroovyZippyHardDrive . Note, however, that any methods etc. called through a base class pointer must be defined in the base class, otherwise a compilation error will result.

HardDrive SomeHardDrive;
SomeHardDrive = new OurNewGroovyZippyHardDrive();
SomeHardDrive.DoSomething(); // OK if DoSomething() is a member of
// HardDrive. Compilation error if
// DoSomething() is only defined
// in OurNewGroovyZippyHardDrive

Casts Between Base and Derived Classes

Since a derived class is automatically also effectively an instance of any base classes, then casting from a derived class to a base class is guaranteed to work, and so may be done implicitly. In particular this means that any class may be implicitly cast to object . Continuing the above example,

// SomeHardDrive declared as type HardDrive
Device OtherRefToHardDrive = SomeHardDrive;
object YetAnotherRef = SomeHardDrive; // Implicit casting to a base class OK

On the other hand, casting from a base class to a derived class is risky. If the object pointed to by the reference is not of the correct type, then an exception could be thrown. Hence casting this way must be done explicitly.

Device SomeDevice2 = new Device();
Device SomeDevice3 = new HardDrive();
HardDrive SomeHD2 = (HardDrive) SomeDevice2; // will compile but raise
// exception on execution as
// SomeDevice2 doesn't refer to
// a HardDrive instance
HardDrive SomeHD3 = (HardDrive) SomeDevice3; // will compile and run OK

If a reference variable currently stores the null reference then the cast will succeed, but will return the null reference. If you want to cast between objects and guarantee that an exception won't be thrown, then you should use the as keyword as described in the section on data types. This will return null if the cast fails.
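
The casting rules above can be tried out with the following sketch. It redefines minimal, empty Device and HardDrive classes so that it is self-contained, and the helper names IsHardDrive() and ExplicitCastThrows() are invented for the example.

```csharp
using System;

class Device { }
class HardDrive : Device { }

static class CastDemo
{
    // as never throws: it yields null when the object is not of the requested type.
    public static bool IsHardDrive(Device device)
    {
        HardDrive drive = device as HardDrive;
        return drive != null;
    }

    // An explicit cast throws InvalidCastException when the object has the wrong type.
    public static bool ExplicitCastThrows(Device device)
    {
        try
        {
            HardDrive drive = (HardDrive)device;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        Device plainDevice = new Device();
        Device reallyAHardDrive = new HardDrive();   // implicit cast to the base class

        Console.WriteLine(IsHardDrive(plainDevice));              // False
        Console.WriteLine(IsHardDrive(reallyAHardDrive));         // True
        Console.WriteLine(ExplicitCastThrows(plainDevice));       // True
        Console.WriteLine(ExplicitCastThrows(reallyAHardDrive));  // False
        Console.WriteLine(ExplicitCastThrows(null));              // False: null casts to null
    }
}
```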

Method Overloads and Method Hiding

It is possible to define any method, indexer, or property as virtual . (Operator overloads are always static, so they cannot be virtual.) This has the usual meaning for object-oriented programming: it indicates that a runtime check should be made of which class an object is an instance of before selecting the appropriate method. For methods that are not explicitly declared as virtual, the compiler uses the declared type of the reference, rather than the type of the object it actually refers to, to determine the method to be called. The difference is significant because of the rule in C# about references being able to refer to instances of derived classes as well as of their declared type.

For example, suppose we have the following classes, which between them contain both virtual and non-virtual methods:

class DatabaseUserAccount
{
public bool Login(string password)
{
// implementation
}
public virtual long GetAdminRights()
{
// implementation
}
public void DoSomethingElse()
{
// implementation
}
}

class DatabaseAdminAccount : DatabaseUserAccount
{
public override long GetAdminRights()
{
// implementation
}
public new void DoSomethingElse()
{
// implementation
}
}

We note that:

  • Login() is declared only in the base class;
  • GetAdminRights() is virtual in the base class and overridden in the derived class
  • DoSomethingElse() is not virtual, but defined in both classes.

Now consider the following code:

DatabaseUserAccount User1 = new DatabaseUserAccount();
DatabaseUserAccount Admin1 = new DatabaseAdminAccount();
DatabaseAdminAccount Admin2 = new DatabaseAdminAccount();

// assume there is code here that correctly initializes these variables.
// note that Admin1 is an admin account but referenced through a
// user account reference

// comment indicates which version of this method is called
User1.Login("Password"); // DatabaseUserAccount
Admin1.Login("Password"); // DatabaseUserAccount
Admin2.Login("Password"); // DatabaseUserAccount

User1.GetAdminRights(); // DatabaseUserAccount
Admin1.GetAdminRights(); // DatabaseAdminAccount
Admin2.GetAdminRights(); // DatabaseAdminAccount

User1.DoSomethingElse(); // DatabaseUserAccount
Admin1.DoSomethingElse(); // DatabaseUserAccount
Admin2.DoSomethingElse(); // DatabaseAdminAccount

The reasoning behind this is as follows:

  • User1 is a DatabaseUserAccount instance referenced through a DatabaseUserAccount reference. Both types (reference and referent) are the same, so the code will look in the DatabaseUserAccount class to identify all methods to be called.
  • Admin1 is a DatabaseAdminAccount instance referenced through a DatabaseUserAccount reference. Therefore, the code will look in DatabaseUserAccount to identify all non-virtual methods, and DatabaseAdminAccount to locate virtual or overridden methods. In this case, the only such method is GetAdminRights() , so the derived class DatabaseAdminAccount version of this method is called.
  • Admin2 is a DatabaseAdminAccount instance referenced through a DatabaseAdminAccount reference. Both types (reference and referent) are the same, so the code will look in the DatabaseAdminAccount class to identify all methods to be called. However, in the case of Login() , no such method is available, so we search back up the class hierarchy to identify a method. The first one we find is the version in DatabaseUserAccount .

In general, virtual methods are slightly slower to execute, because the computer does not know until runtime what instance a reference is referring to - hence a runtime check needs to be made. If the method is not virtual, then the compiler can work out at compile-time which method should be called.

Functions (methods, properties, indexers) should be declared as virtual if it is likely that derived classes will need to implement their own versions of them. If a method is virtual then it is guaranteed that the correct version of it for any given class instance will always be called, no matter what the reference type used to obtain that instance is. If you do not intend for derived classes to supply their own implementations of a method, then for performance reasons it should not be declared as virtual. This situation is illustrated in the above example: The process of logging in is likely to be identical for all accounts so the Login() method is not virtual. But a user's administrative rights depend on the type of account, so this method is likely to be overridden. Hence, it is declared as virtual. You cannot define constructors or casts as virtual, and destructors are always automatically virtual.

Define methods, indexers and properties as virtual if derived classes might need to implement different versions of them.

The DoSomethingElse() method is not virtual, but has had a new version of it defined in the derived class. Usually, you would not deliberately code up this situation. It may occur, however, if a new version of a base class is released which has new methods, one of which happens to have the same name as a method in the derived class. The new method is said to hide the base class version.

If a method is virtual, then all overrides of it must be declared as override. If a method is not virtual, then any implementations in derived classes that hide it should normally be declared as new . Not doing so is not a syntax error, but it will raise a compiler warning. This warning will ensure that developers will know if their method is 'accidentally' hiding a method in the base class, and they can decide what action to take.
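
The contrast between overriding and hiding can be confirmed with a runnable sketch. The classes below are simplified stand-ins for the account classes above: the names UserAccount and AdminAccount are invented, and each method merely returns the name of the class whose version was executed.

```csharp
using System;

class UserAccount
{
    public virtual string GetAdminRights() { return "UserAccount"; }
    public string DoSomethingElse() { return "UserAccount"; }
}

class AdminAccount : UserAccount
{
    // override: selected at runtime from the actual type of the object
    public override string GetAdminRights() { return "AdminAccount"; }

    // new: hides the base version; selected at compile time from the reference type
    public new string DoSomethingElse() { return "AdminAccount"; }
}

static class DispatchDemo
{
    static void Main()
    {
        UserAccount admin1 = new AdminAccount();   // base-class reference, derived-class object
        AdminAccount admin2 = new AdminAccount();

        Console.WriteLine(admin1.GetAdminRights());   // AdminAccount
        Console.WriteLine(admin1.DoSomethingElse());  // UserAccount
        Console.WriteLine(admin2.DoSomethingElse());  // AdminAccount
    }
}
```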

Abstract Methods

A method etc. can be declared as abstract . This means that it is virtual, and also that no implementation is being provided in the base class. The usual situation for doing this is if your base class is generic, and intended only to be derived from rather than to be instantiated.

abstract class Device // generic device that can represent any hardware
// device attached to a computer
{
public abstract string GetDeviceName(); // abstract members may not be private

If a class contains any abstract methods then the class is also considered to be abstract, and must be marked as such explicitly in the class definition, as above. It is a syntax error to attempt to instantiate an abstract class.

Device SomeDevice; // OK - just declaring a reference
SomeDevice = new Device(); // WRONG - an attempt was made to instantiate
// a class that has abstract methods

It is fine to derive classes from an abstract class, but it is a syntax error to instantiate any instances of such classes unless they provide overrides to all abstract methods that they have inherited.

class HardDrive : Device
{
public override string GetDeviceName()
{
// implementation
}

// later in code
SomeDevice = new HardDrive(); // OK

The rules for abstract methods are designed to prevent you from ever instantiating a class that does not have implementations for all its methods.

In general you will use an abstract class in any situation in which a number of classes have common functionality. In our example here, Device is useful as an abstract class because it can be used to encapsulate functionality that is common to many particular devices - hard drive, CD-ROM drive, etc. However, there's no such thing as a generic device in real life, so you probably wouldn't want to actually instantiate a Device . By using an abstract class in this situation you can provide more structure to the class hierarchy, and provide common implementations of those methods that are common across devices, while leaving abstract implementation-less methods in Device for those methods whose implementations will be specific to each type of device.

The opposite to abstract classes are sealed classes, from which you are not allowed to derive other classes. They are marked by the sealed keyword in the class definition:

public sealed class MyUnderivableClass
{
// Implementation
}

The reasons for doing this may have something to do with security or design. The System.String class of the .NET base class library is an example of a sealed class. It follows that because abstract and sealed do the opposite task (one requiring class derivation, the other prohibiting it) it is nonsensical to have the two keywords applied to the same class.

Structs

Structs are intended for the situation in which you need small, lightweight, objects that do not have the full functionality of classes. They are particularly useful when an object simply consists of a couple of pieces of information grouped together.

You can declare a struct using a similar syntax to that for a class, except that we use the keyword struct instead of class:

struct Booking
{
public string FlightID;
public string CustomerID;
}

The comments in previous sections of this appendix about syntax for classes also apply to structs, apart from the use of the keyword struct in the initial declaration, and apart from the following differences.

  • You may not declare a struct as deriving from any other class or struct. Structs are always implicitly derived from System.ValueType. Structs may, however, implement interfaces in the same way as classes, for example:
    struct Vector : IEnumerable

  • It is not possible to define destructors for structs. As noted below, structs are value types and hence are simply removed from the stack whenever they go out of scope.

In addition, we will note here the following other differences between use of classes and structs which are described in the appropriate part of the appendix.

  • Member methods of structs may not be declared as virtual.
  • It is not possible to define a no-parameter constructor for a struct. Instead, the compiler always supplies a default constructor (even if you have defined other constructors).
  • Structs are value types whereas classes are reference types.
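
The practical effect of the last point can be sketched as follows. The type names BookingStruct and BookingClass and the flight codes are hypothetical, chosen only to put the two behaviours side by side.

```csharp
using System;

struct BookingStruct
{
    public string FlightID;
}

class BookingClass
{
    public string FlightID;
}

class Program
{
    static void Main()
    {
        BookingStruct s1 = new BookingStruct();
        s1.FlightID = "BA100";
        BookingStruct s2 = s1;            // copies the whole struct
        s2.FlightID = "QF1";
        Console.WriteLine(s1.FlightID);   // s1 is unaffected: BA100

        BookingClass c1 = new BookingClass();
        c1.FlightID = "BA100";
        BookingClass c2 = c1;             // copies only the reference
        c2.FlightID = "QF1";
        Console.WriteLine(c1.FlightID);   // same object as c2: QF1
    }
}
```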

Fields

A field is C# terminology for a variable that has been defined as a member of a class or struct. The rules for declaring and instantiating fields are the same as those for declaring variables that are local to methods. Note that it is quite acceptable to initialise a field by indicating the constructor to be called or the initial value of the field from within the class definition.

class Automobile
{
ushort NWheels = 4;

The preferred way to initialise a field for which the initial value is known at compile time is in the class definition as above. However, fields may also be initialised in constructors.

class Truck
{
ushort NWheels;

Truck(string Make, string Model)
{
// code to check against the database how many wheels
// this sort of vehicle has. Assume result is returned as X
NWheels = X; // whatever value we find out
}

By initializing a field in a constructor you gain the flexibility of being able to initialise it with a value computed at runtime. However you need to take care to ensure that the value is initialised correctly in all constructors that you define.

const Fields

A field may be declared as const, in which case it is treated as a fixed value rather than a variable. Its value must be assigned in the source code with its declaration and must be known at compile time.

class Password
{
public const uint MaxLength; // WRONG - no initial value
public const uint MaxLength = 20; // OK
public int X = 20;
public const uint MaxLength = X; // WRONG - const set to a non-const
// value. Hence compiler can't be sure of the value
public const uint TwiceMaxLength = 2*MaxLength;
// OK. Compiler can work this out

In this regard, the rules for const fields are identical to those for const local variables.

One important point about const fields is that they are always implicitly static (although it is a compilation error to explicitly declare them as such):

class Password
{
public static const uint MaxLength = 20;
// WRONG - const is already static. No need to declare it as such

Because const fields are implicitly static, they must be accessed as for static variables - using the name of the class, not the name of the object:

   Password ApplicationPassword = new Password();
uint Length = ApplicationPassword.MaxLength; // WRONG -
// static variable mustn't be accessed using variable
// name
uint Length2 = Password.MaxLength; // OK -
// correct syntax for static field

readonly Fields

Read-only fields are intended for the situation in which you want a field to be treated as a constant, but you will not know until runtime what the value will actually be. The syntax for declaring a read-only variable is as shown in this example, in which we assume we are writing a class to represent Internet sessions.

class Session
{
protected readonly string SessionID;

Note in this example, for the sake of variety, we've illustrated a protected field, but read-only fields can have any access level set, as can const fields.

Unlike const fields, read-only fields do not have to be initialised in the class definition, though you may do so if you wish. The normal practice is to initialise read-only fields in the constructor(s).

   public Session(string IPAddress)
{
SessionID = GetIDFromDatabase(IPAddress);
}

Read-only fields may be set in constructors, but may not be assigned to anywhere else. Hence their values remain constant from the moment that the containing class has been constructed.

In contrast to const fields, read-only fields are not necessarily static. As the above example shows, it is perfectly possible for each instance of a class to have its own value of a read-only field. However, you may declare read-only fields as static if you wish.

class Session
{
public static readonly string ServerIPAddress;
protected readonly string SessionID;

Apart from an initializer in the field's declaration, the only place where static read-only fields may be assigned to is inside the static constructor.

   static Session()
{
ServerIPAddress = GetMyIPAddress();
}

Note that it is not permitted to assign to static read-only fields inside any instance constructor (though they may be initialized inside the class definition).
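
The three kinds of field can be seen together in the following sketch of the Session class. The constant MaxIdLength, the Describe() method and the hard-coded strings standing in for GetMyIPAddress() and GetIDFromDatabase() are assumptions made so that the example is self-contained.

```csharp
using System;

class Session
{
    public const uint MaxIdLength = 20;             // compile-time constant, implicitly static
    public static readonly string ServerIPAddress;  // set once, in the static constructor
    protected readonly string SessionID;            // set once per instance, in a constructor

    static Session()
    {
        ServerIPAddress = "192.0.2.1";   // stand-in for GetMyIPAddress()
    }

    public Session(string ipAddress)
    {
        SessionID = "ID-" + ipAddress;   // stand-in for GetIDFromDatabase()
    }

    public string Describe()
    {
        // Reading the fields is allowed anywhere; assigning to them here would not compile
        return SessionID + " on " + ServerIPAddress;
    }
}

class Program
{
    static void Main()
    {
        Session s = new Session("198.51.100.7");
        Console.WriteLine(s.Describe());          // ID-198.51.100.7 on 192.0.2.1
        Console.WriteLine(Session.MaxIdLength);   // const accessed through the class name: 20
    }
}
```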

Methods

A method is a function that is a member of a class or struct and that is called using ordinary function-call syntax (as opposed to properties and indexers, which are wrapped in a syntax that makes them look like fields and array elements to the calling code).

Calling Methods

The syntax for calling a method is simply to write the method name, followed by any parameters. As with all other class members, it may be necessary to prefix the method name with the details of the class or class instance that contains the method. The following example uses two of the .NET base classes, System.Windows.Forms.RichTextBox and System.Windows.Forms.MessageBox .

RichTextBox TextBoxFile = new RichTextBox();
TextBoxFile.LoadFile(@"C:\My Documents\ReadMe.txt");
MessageBox.Show("File has been loaded"); // Show is a static method, so the
// class name is used

If the method returns a value, then the method can be syntactically treated as a read-only variable from the point of view of obtaining the value.

if (MessageBox.Show("File has been loaded") == DialogResult.OK)
DoSomething();

Defining Methods

The syntax is as follows.

public void DoSomething( /* any parameters here */)
{
// implementation
}

Note that this means the entire code for the method must be specified within the definition of the class of which the method is a member.

The return type of the method must be indicated. This may be any data type known to the compiler (such as a predefined data type such as long or ushort , or the name of a user-defined data type). Alternatively, a return type of void may be indicated, which means that the method does not return anything.

Within the method, all execution paths must lead to an explicit return statement (or throw an exception), except that in the case of a void method only, it is acceptable for the return to be implied when execution hits the closing curly brace that marks the end of the method definition:

   static public void Main(string[] args)
{
Console.WriteLine("Main method");
// return implied because method returns void
}

ref and out Parameters

By default parameters are passed by value to methods. This means that a copy of the value is made for the benefit of the method called. For value types this means that the method cannot alter the value of the variable passed in, as seen by the calling method. It also means that expressions as well as variables can be passed in.

int AddNumbers (int A, int B)
{
return A+B;
}

// later
int Result = AddNumbers(10, 2*X);

For reference types, the copy that is made is a copy of the reference, which simply contains the address of the object; the object itself is not copied. The method therefore works on the same object as the caller and can alter its contents, although assigning a new object to the parameter inside the method has no effect on the caller's variable.

If you do want a method to be able to alter the value of a parameter, you may declare the parameter as a ref parameter or as an out parameter. Either way will allow the parameter to be modified by the method.

If a value type is passed as a ref or out parameter to a method then the method actually receives the address of the variable - and hence can use this address to access or modify the actual variable from the calling routine. This also carries performance advantages for structs, since only the address is copied rather than the entire contents of the struct.

The ref and out parameters behave in the same way except that the rules for initializing them are different.

  • You should use a ref parameter when the intention is that the called method should modify the value of the variable. The C# compiler requires that ref parameters are initialized before being passed to a method.
  • You should use an out parameter when the intention is that the called method should initialize the value of the variable. Variables passed as out parameters do not need to be initialized before being passed to a method, but within the called method the compiler regards them as uninitialized. Hence they must be assigned to before the method returns and before their values are accessed.

Use of ref and out parameters is illustrated by this example.

// converts value to its square and fills a bool to indicate sign of value
void Square(ref double value, out bool isNegative)
{
isNegative = value < 0;
value *= value;
}

The ref or out keywords must be specified explicitly when calling a method that takes ref or out parameters:

double Quantity = 4.0;
bool IsNegative;
Square (ref Quantity, out IsNegative);
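
Putting these pieces together, the following sketch shows the Square() method in a complete program, alongside a reference type passed by value for comparison. The Counter class and the Increment() method are hypothetical additions for illustration.

```csharp
using System;

class Counter
{
    public int Count;
}

class Program
{
    // ref: the caller's variable is read and modified; out: it is initialized here
    public static void Square(ref double value, out bool isNegative)
    {
        isNegative = value < 0;
        value *= value;
    }

    // Passed by value: the reference is copied, so the object can be changed...
    public static void Increment(Counter c)
    {
        c.Count++;
        c = new Counter();   // ...but reassigning c does not affect the caller
    }

    static void Main()
    {
        double quantity = -4.0;
        bool isNegative;     // no initial value needed for an out argument
        Square(ref quantity, out isNegative);
        Console.WriteLine(quantity);      // 16
        Console.WriteLine(isNegative);    // True

        Counter counter = new Counter();
        Increment(counter);
        Console.WriteLine(counter.Count); // 1
    }
}
```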

Method Overloads

It is permitted in C# to supply more than one method in a class with the same name, but different signatures (types and numbers of parameters). A good example of this is the Console.WriteLine() method that is part of the Console .NET base class, and which writes output to the console window. For example, the available overloads include:

public static void WriteLine(string value)
{
// writes out the string
}

public static void WriteLine(bool value)
{
// writes out string representation of the bool
}

Defining overloads is no different from defining any other method. However, calling them involves slightly different principles, as the compiler needs to work out which overload you intend to call. When the compiler encounters a call to an overloaded method, it examines the number and types of the arguments being passed in. If an overload exists that takes exactly those types of parameters, then it calls that overload. If no exact match exists, then it examines those overloads that take the same number of parameters, and checks whether the arguments can be implicitly converted to match one of them. If exactly one overload can be matched in this way, it picks that overload. If more than one overload could be matched, it picks the 'best' one using certain criteria, and reports a compilation error if no single best overload exists. Full details are in the MSDN documentation, but in general you are advised not to write code that leaves any uncertainty over which overload of a method will be called, and instead to cast the arguments explicitly to match the required overload.

Note that the return type does not distinguish overloads. It is not permitted to write overloads of a method that differ only in the return type.
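
A small sketch of how the compiler chooses between overloads is given below. The Printer class and its Describe() methods are hypothetical; each overload simply reports which one was selected.

```csharp
using System;

class Printer
{
    public static string Describe(int value)    { return "int"; }
    public static string Describe(double value) { return "double"; }
    public static string Describe(string value) { return "string"; }
}

class Program
{
    static void Main()
    {
        short s = 5;
        Console.WriteLine(Printer.Describe(10));        // exact match: int
        Console.WriteLine(Printer.Describe("ten"));     // exact match: string
        Console.WriteLine(Printer.Describe(s));         // no exact match; short converts to int: int
        Console.WriteLine(Printer.Describe((double)s)); // explicit cast removes any doubt: double
    }
}
```

In the third call a short can be implicitly converted to either int or double, and the compiler regards the conversion to int as the better one. The explicit cast in the fourth call is the style recommended above when you want a particular overload.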

Variable Argument Lists

It is possible to define a method that takes an unspecified number of arguments, using the params keyword, followed by an array as the parameter list:

void MyMethod(params object[] args)
{
// etc.

What then happens when the method is called is that the compiler takes the arguments passed in and packages them up into an array to pass to the method.

int X, Y;
string Z;
// initialize these variables
MyMethod(X, Y, Z); // MyMethod will receive an array of size 3

For the most general case, shown above, you will declare the array as of type object, which means that calling methods can pass in any variables whatsoever. If you know however that the variables to be passed in will all be derived from a certain class, you can declare the array to be of that class:

void MyMethod(params System.EventArgs[] args)
{
// etc.

Packaging up the parameters into an array does hit performance, so you are better off providing explicit overloads of the method for the particular sets of parameters you know you are likely to require. The compiler will only call the variable-argument-list overload of a method if it cannot find any other overloads that are suitable.
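
The preference for explicit overloads over the variable-argument-list overload can be sketched like this. The Join() methods are hypothetical and just report which overload was called.

```csharp
using System;

class Program
{
    // Variable-argument-list overload: arguments are packaged into an array
    public static string Join(params object[] args)
    {
        return "params overload, " + args.Length + " argument(s)";
    }

    // Explicit overload for the common two-argument case
    public static string Join(object a, object b)
    {
        return "two-argument overload";
    }

    static void Main()
    {
        Console.WriteLine(Join(1, "two"));       // two-argument overload
        Console.WriteLine(Join(1, "two", 3.0));  // params overload, 3 argument(s)
        Console.WriteLine(Join());               // params overload, 0 argument(s)
    }
}
```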

Properties

A property consists of a pair of methods (or in some cases, as detailed below, a single method) which are syntactically dressed up so that they look like a field.

Using Properties

The idea of a property is that code that calls the property effectively sees a member field.

MyControl.Width = 500;
int HalfWidth = MyControl.Width/2;

However, in fact the property contains two methods known as the get accessor and the set accessor . Attempts to set the property actually result in a call to the set accessor method, while attempts to return the value of the property actually result in a call to the get accessor method. The above example shows this clearly since, for any .NET control, setting the width actually changes the width of the control on the screen - hence some sequence of code must be being executed in response to the first line of the above code.

In order for this concept to work, the set accessor must return void , and take one parameter which is of the same data type as defined for the property. The get accessor must take no parameters and return the data type as defined for the property.

Defining Properties

The syntax for defining a property is as follows

public int Width
{
set
{
// code to set Width
}
get
{
// code to get Width
}
}

The syntax is as for defining a field, except that in place of the closing semicolon is a pair of braces containing the two accessor methods, which are marked by the keywords get and set . The two accessors may appear in either order.

As an example of this, we will assume that the property Width is implemented to simply access a member variable, width . The code will then look as follows.

   private int width;

public int Width
{
set // signature is implicitly void set (int value)
{
width = value;
}
get // signature is implicitly int get()
{
return width;
}
}

Notice the different case of the field width and the property Width. This kind of situation is a good example of how the case-sensitivity of C# can be exploited.

Note that the parameter supplied to the set accessor is not explicitly mentioned in the definition of the accessor. However, within the body of the accessor, the value keyword is used to indicate this parameter.

Either of the two accessors may be omitted from the declaration. If the set accessor is omitted then the property is read-only, and code that refers to this property will only compile if it attempts to read the property, not if it attempts to write to it:

class SomeControl
{
public int Width
{
get // signature is implicitly int get()
{
return width;
}
}
// etc.
}

// later on
// MyControl is a reference to a SomeControl instance
int HalfWidth = MyControl.Width; // OK
MyControl.Width = 800; // WRONG. Width is read-only

Supplying a property with only a get accessor (a read-only property) is the usual way that you can provide read-only access to a member field.

Conversely, if the get accessor is omitted the property is a write-only property. Although this is legal syntax it is not regarded as good programming style. Write-only properties should normally be implemented as methods. Just as with read-only properties, if you attempt to access an accessor that doesn't exist, this will be picked up as a compilation error.
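
Because accessors are ordinary code, they can do more than copy a value to or from a field. The sketch below assumes a SomeControl class whose Width property rejects negative values and which also exposes a read-only HalfWidth property; both the validation rule and HalfWidth are illustrative assumptions.

```csharp
using System;

class SomeControl
{
    private int width;

    public int Width
    {
        get { return width; }
        set
        {
            // value is the implicit parameter holding the assigned value
            if (value < 0)
                throw new ArgumentOutOfRangeException("value");
            width = value;
        }
    }

    // Read-only property: there is no set accessor
    public int HalfWidth
    {
        get { return width / 2; }
    }
}

class Program
{
    static void Main()
    {
        SomeControl control = new SomeControl();
        control.Width = 500;                   // calls the set accessor
        Console.WriteLine(control.HalfWidth);  // calls a get accessor: 250
        // control.HalfWidth = 10;             // would not compile: no set accessor
    }
}
```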

Indexers

An indexer is very similar to a property, except that instead of being dressed syntactically to look like a field, an indexer has the effect of allowing its containing class to be treated as if it was an array, with the indexer being the method that gets called when you attempt to access an element of the array.

Calling Indexers

As an example, if a class Vector has an indexer which takes a long as a parameter and allows access to its (double) components, then we could write:

Vector SomeVector = new Vector(); // assume constructor adequately
// initializes Vector
SomeVector[0] = 40.0;
double Y = SomeVector[1];

Although we are used to thinking of arrays as being accessed using an integer data type, indexers may be defined to take any data type as the parameter. For example, components of vectors are, customarily, referred to as X, Y and Z. If we wished to be able to access components of the vector using the names of the components as indexes, then we can define an indexer which takes a string as its parameter. Then we would be able to write this.

SomeVector["X"] = 40.0;
double Y = SomeVector["Y"];

It is also possible to define indexers that take more than one parameter. This is analogous to a rectangular multidimensional array. For example, if you define a Matrix class you would probably want to be able to access its elements (which we will assume are of type double) by representing the matrix as a 2D array. Then you could write:

Matrix SomeMatrix;
// initialize SomeMatrix
double Element00 = SomeMatrix[0,0];
SomeMatrix[1,2] = 10.0;

In both of the above examples the return type from the indexer is a double . In fact, the return type can be any data type you wish.

Defining Indexers

An indexer is defined using the same syntax as for a property, except that we have to specify the parameters. For example, an indexer that takes one long parameter and returns a string would be defined like this.

   public string this [long index]
{
get // implicitly returns a string and takes one parameter, index
{
// code to return value
}
set // implicitly returns a void and takes two parameters, value (a
// string) and index (a long)
{
// code to set value
}
}

Note that in this code sample, the name of the parameter is up to you. We picked index in this case, but that is just a variable name - it's not a keyword, unlike the implicit parameter to the setter, which is always named value .

We can use our Matrix example to demonstrate a full implementation of an indexer that takes two long s as parameters and returns a double . We'll assume here that the Matrix indexer simply wraps around an array field of type double[,] .

class Matrix
{
private double [,] Elements; // assume this gets initialized
// somewhere

public double this [long row, long column]
{
get // implicitly returns a double and takes two long params
{
return Elements[row,column];
}
set // implicitly returns a void and takes two long and one
// double parameter
{
Elements[row,column] = value;
}
}

As with properties, the get accessor method must return the same data type as the return type of the indexer itself, while the set accessor must return void and also takes an implicit parameter, value , which will be the value passed to the indexer as the right hand side of an expression. Accessors also implicitly take the same parameters as explicitly defined for the indexer ( row and column in the above example).

Also, as with properties, it is permissible to omit one of the accessors from the definition of the indexer, in which case the indexer will be either read-only or write-only.

As with methods, it is possible to provide as many overloads of an indexer as you wish, with the proviso that the overloads must be identifiably different in their signatures. The compiler will choose the closest matching overload in each given context.
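
As a sketch of overloaded indexers, the Vector class used earlier could provide one indexer taking a long and another taking a string. The internal array and the ToIndex() helper are assumptions about how such a class might be implemented.

```csharp
using System;

class Vector
{
    private double[] components = new double[3];

    // Indexer taking a numeric position
    public double this[long index]
    {
        get { return components[index]; }
        set { components[index] = value; }
    }

    // Overload taking the component name
    public double this[string name]
    {
        get { return components[ToIndex(name)]; }
        set { components[ToIndex(name)] = value; }
    }

    // Hypothetical helper mapping "X", "Y", "Z" to array positions
    private static long ToIndex(string name)
    {
        switch (name)
        {
            case "X": return 0;
            case "Y": return 1;
            case "Z": return 2;
            default: throw new ArgumentException("Unknown component: " + name);
        }
    }
}

class Program
{
    static void Main()
    {
        Vector v = new Vector();
        v[0] = 40.0;                 // long overload (int argument converted implicitly)
        v["Y"] = 3.0;                // string overload
        Console.WriteLine(v["X"]);   // 40
        Console.WriteLine(v[1]);     // 3
    }
}
```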

Constructors

Constructors are similar to methods, except that they do not have a return type and are intended to initialize instances of a class or struct when they are created.

Calling Constructors

Constructors are always called with the new operator. The syntax for calling constructors was actually given earlier in the appendix, in the section about initializing variables:

int NCustomers = new int();

The above code shows the explicit syntax for calling a constructor for the predefined data type, int - this constructor takes no parameters.

Since int is a predefined data type, the constructor is already defined. However, if you define your own classes or structs, then you have the option to define your own constructors, with or without parameters.

Although we remarked that constructors are really intended for initialization, there is nothing wrong with calling the constructor against a variable after it has been initialized. For value types, this will have the effect of resetting all fields to the values specified in the constructor. For reference types, it will have the effect of creating a new instance of the class on the heap, and reassigning the reference the constructor is called against to the new instance.

int NCustomers = 50;
// do stuff with NCustomers
NCustomers = new int(); // example with value type.
// Resets NCustomers to zero

Customer NextCustomer = new Customer();
// do stuff with NextCustomer
NextCustomer = new Customer(); // example with reference type
//Creates new Customer instance

For the predefined data types as shown above, there is little point in using the constructor syntax since it is easier to write, for example, NCustomers = 0 in the above case. However, you may find this technique useful for your own classes and structs.

Defining Constructors

You do not have to define constructors for your classes and structs. If you do not do so then the compiler will automatically supply a default constructor for each class or struct that you define. The default constructor will work by simply zeroing out those member fields which have not been explicitly initialized in the class definition, and will take no parameters.

The rules for what constructors you can add and the effect of this on the default constructor are slightly different for classes and for structs:

  • For classes you can replace the default constructor by supplying your own constructor that takes no parameters.
  • For structs, you cannot replace the default constructor. It is a compilation error to attempt to do so.
  • For both classes and structs you can supply as many overloads that take parameters as you wish to the constructor - provided these overloads have different signatures (in other words the number and type of the parameters are different for each constructor).
  • For classes, if you supply any constructors of your own, then no default constructor will be generated. This means that if you supply parametered constructors but no parameterless constructor, then it will not be possible to instantiate the class without specifying parameters.
  • For structs, the default constructor is always present, even if you add your own constructors.

The syntax for defining constructors is very similar to that for defining methods, except that the constructor does not have a return type - the thing implicitly returned from a constructor is the instance that has just been created. The constructors are distinguished from methods in the class definition because they have the same name as the name of the class.

As an example of a user-defined constructor, we'll assume we want to write an Employee class, where employees must have a name that must be supplied when the Employee is instantiated. Hence we might write this.

class Employee
{
private string name;

public Employee(string name)
{
this.name = name;
}

This example illustrates the most common use for parametered constructors: to store the values of the parameters in member fields - but in fact you can place any legal code in the constructor.

With the above example, you can then instantiate an employee like this:

Employee TheBoss = new Employee("Mr Gates");

However, the following code would fail, because by supplying a user-defined constructor, we have prevented the compiler from supplying a default one.

Employee NewRecruit = new Employee(); // compilation error -
// no no-parameter constructor

If you want to be able to construct employees like that, then you would need either to supply a no-parameter constructor explicitly, or not to supply any constructor of your own at all.

class Employee
{
private string Name;

public Employee(string name)
{
Name = name;
}


   public Employee() // now the above code will work
{
}

By contrast, if Employee had been declared as a struct, there would always be a default constructor, so it would always be possible to write:

Employee NewRecruit = new Employee(); // OK if Employee declared as struct

Calling Constructors from Other Constructors

It is acceptable for one constructor to call another constructor, but you must do so using a precise syntax known as a constructor initializer. The usual reason for calling other constructors is to supply default values for a constructor's parameters without duplicating the code in each constructor. For example, suppose it is possible to construct our Employee class in one of two ways:

  • Supplying the name of the employee
  • Supplying the name and salary of the employee

Where the salary, if not supplied, is assumed to be 20,000 dollars. This could be implemented using the following code.

class Employee
{
private string name;
private decimal salary;

public Employee(string name)
: this(name, 20000M)
{
// if you want to do anything else here you can do so
}

public Employee(string name, decimal salary)
{
this.name = name;
this.salary = salary;
}

From this we see that the syntax for the constructor initializer consists of a colon, followed by the keyword this, followed by any parameters to be passed to the other constructor. There is no semicolon after the constructor initializer. The compiler will determine from the number and type of these parameters which other constructor is to be called, according to the usual rules for method overloads.

It is important to understand that the constructor initializer is executed before the constructor body. This determines which fields you can rely on having been initialized at any given point: anything set by the constructor named in the initializer is available in the body, but not the other way round.

It is alternatively possible to call a base class constructor in the constructor initializer. To do this you replace the keyword this with the keyword base in the above code.

For example, suppose the class Manager is derived from Employee, and you want the constructor for Manager to be roughly the same but with an additional parameter, bonus. You might implement the Manager constructor as follows.

class Manager : Employee
{
decimal bonus;

public Manager(string name, decimal salary, decimal bonus)
: base(name, salary)
{
this.bonus = bonus;
}

It is not possible to place more than one other constructor in the constructor initializer. Nor is it possible to place anything other than a call to another constructor of this class or a constructor of the base class in it. In other words, the constructor initializer, if specified, always has one of the following two formats:

   : this(/*parameters*/)

   : base(/*parameters*/)

If you do not specify the constructor initializer, then the compiler will supply one for you, which calls the no-parameter constructor of the base class. Hence, for any constructor the following two definitions are equivalent.

class MyClass : MyBaseClass
{
public MyClass()
: base()
{

class MyClass : MyBaseClass
{
public MyClass()
{

A consequence of this is that, since the constructor initializer is always executed before the constructor body, the constructors of any class will always be executed in order going down the hierarchy. In other words, the System.Object constructor will always be executed first, followed by the constructor(s) of the first derived class, followed by the next class, etc. until the constructor(s) for the class we are actually instantiating is executed last.
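
The order of execution can be observed with a sketch along the following lines. It is a variation on the Employee and Manager classes, in which the Manager constructor shown here (taking only a name and a bonus), the Salary property and the Console.WriteLine() calls are assumptions added so that the order is visible.

```csharp
using System;

class Employee
{
    private string name;
    private decimal salary;

    public Employee(string name) : this(name, 20000M)
    {
        Console.WriteLine("Employee(string) body");
    }

    public Employee(string name, decimal salary)
    {
        this.name = name;
        this.salary = salary;
        Console.WriteLine("Employee(string, decimal) body");
    }

    public decimal Salary
    {
        get { return salary; }
    }
}

class Manager : Employee
{
    private decimal bonus;

    public Manager(string name, decimal bonus) : base(name)
    {
        this.bonus = bonus;
        Console.WriteLine("Manager body");
    }
}

class Program
{
    static void Main()
    {
        Manager m = new Manager("Ms Example", 5000M);
        // Prints, in order:
        //   Employee(string, decimal) body
        //   Employee(string) body
        //   Manager body
        Console.WriteLine(m.Salary);   // default supplied via this(name, 20000M): 20000
    }
}
```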

Note that the compiler will check that each constructor has access to the constructor in its constructor initializer. This means that if any constructor is declared as private, it cannot be called from any derived classes. For example:

class MyBaseClass
{
private MyBaseClass()
{
}
// etc.

}

class MyClass : MyBaseClass
{
public MyClass() // WRONG. This implicitly calls base(),
// but that constructor is private hence not visible
// in the derived class. Will give compilation error
{
// etc.

Static Constructors

Besides the instance constructors, it is also possible to define one static constructor in each class. The syntax looks like this.

class MyClass
{
static MyClass()
{
// do stuff here
}

The static constructor is used to initialize static fields, including static read-only fields. A static constructor is guaranteed to be executed only once. You cannot determine precisely when it will be executed, but it will be before any other code instantiates objects of the class or accesses any static members. In other words, the static constructor will be called before any other code attempts to use the class.

A typical example of the use of static constructors was presented earlier, in the section on read-only fields:

class Session
{
public static readonly string ServerIPAddress;
protected readonly string SessionID;

static Session()
{
ServerIPAddress = GetMyIPAddress();
}

As with other static members, it is not possible to access any instance data or fields from the static constructor. Syntactically, the static constructor never takes any parameters, does not have a return type, and does not even have an access modifier. It would be meaningless to describe the static constructor as, for example, public or protected, since the static constructor is called only by the .NET runtime, never by any other C# code.
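
The following sketch illustrates that a static constructor runs once, before the class is first used. The call counter, the StaticConstructorCalls property and the hard-coded address standing in for GetMyIPAddress() are assumptions added for the demonstration.

```csharp
using System;

class Session
{
    public static readonly string ServerIPAddress;
    private static int staticConstructorCalls = 0;

    static Session()
    {
        staticConstructorCalls++;
        ServerIPAddress = "192.0.2.1";   // stand-in for GetMyIPAddress()
        Console.WriteLine("static constructor running");
    }

    public static int StaticConstructorCalls
    {
        get { return staticConstructorCalls; }
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine("Main started");
        // First use of Session triggers the static constructor
        Console.WriteLine(Session.ServerIPAddress);
        new Session();
        new Session();
        // Creating instances does not run it again
        Console.WriteLine(Session.StaticConstructorCalls);   // 1
    }
}
```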

Destructors

Where constructors are called automatically to initialize an object, destructors are designed to provide the ability for an object to clean up any resources it has been using when it is destroyed.

In general, most classes will not need destructors, since any managed memory being used by a reference object will automatically be removed when the object is garbage collected, while memory on the stack that is occupied by a value object will be freed when the object goes out of scope. The only objects for which destructors are needed are those that maintain external resources that are not managed by the .NET runtime - for example connections to files or databases or old-style Windows handles.

Defining a destructor is more complex than defining a constructor, since a well-designed destructor will be implemented in two stages. On the other hand, as just remarked, it will not be required at all for most classes.

Destruction of structs works differently to destruction of classes. Much of the contents of this section apply only to classes. You may have a struct implement IDisposable and supply a Dispose() method for it, but C# does not permit destructors themselves to be defined for structs.

In order to have a class clean up its resources, you should:

  • Derive the class from the IDisposable interface, and provide an implementation of this interface's Dispose() method.
  • Implement the destructor itself.

The idea is this:

  • Any client code that uses the class should call Dispose() explicitly when it has finished with a given object. Dispose() should be implemented, for example, to free any external resources that are being used by the object. Calling Dispose() means that these resources will be freed promptly.
  • The destructor itself will be called automatically when the garbage collector destroys the object. This serves as a backup mechanism just in case the client code, for whatever reason, does not call Dispose() .

The reason for treating the destructor as purely a backup rather than as the primary mechanism for cleaning up resources is twofold: Firstly, there is no guarantee about precisely when it will be executed, because it is called by the garbage collector, and there is no guarantee about when the garbage collector will run. This means that if we relied solely on the garbage collector to free up external resources then those resources would most likely not be freed until considerably later than the time at which they become no longer needed. Secondly, executing a destructor does affect the performance of the garbage collector. Also, because of the way the garbage collector has been implemented, objects that have destructors will not actually be destroyed until the second time that the garbage collector is called - so their lifetimes will be longer than needed. Because of this, it's desirable to avoid calling destructors where possible, and in fact a method has been provided by one of the .NET base classes, System.GC.SuppressFinalize() , which should be called from the Dispose() method and which notifies the .NET runtime that there is now no need to run the destructor for a given object.

Disposing of objects successfully requires cooperation between the object and the client code that instantiates the object. In the next section we'll see what the client code has to do, and in doing so we'll see how the destructor is actually defined.

Disposing Objects

Client code should be implemented to call Dispose() on an object that implements this method when it has finished with the object. Client code cannot call the destructor itself - that is called by the garbage collector.

Suppose we have a class, FileConnection , which contains a link to an external resource. Then we would use it like this.

FileConnection MyObject = new FileConnection();
// use MyObject

MyObject.Dispose();
// should not use MyObject again

However, C# provides an alternative syntax which will result in Dispose() being called automatically:

using (FileConnection MyObject = new FileConnection())
{
// code that uses MyObject
}

If the using statement is applied in this way, the variable defined in the parentheses will be scoped to the block statement that follows. When execution flow leaves the block, by whatever means (reaching the closing brace, return , break , an exception, etc.), MyObject.Dispose() will be called implicitly.
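As a minimal sketch of this behavior, the following program throws an exception inside a using block and checks that Dispose() has already run by the time the exception is caught. The FileConnection class here is a hypothetical stand-in that merely records whether it was disposed.

```csharp
using System;

// Hypothetical resource class; it only records whether it was disposed.
class FileConnection : IDisposable
{
    public bool IsDisposed;

    public void Dispose()
    {
        IsDisposed = true;
        Console.WriteLine("Dispose() called");
    }
}

class Program
{
    static void Main()
    {
        FileConnection observed = null;
        try
        {
            using (FileConnection myObject = new FileConnection())
            {
                observed = myObject;
                throw new InvalidOperationException("something went wrong");
            }
        }
        catch (InvalidOperationException)
        {
            // Dispose() has already run by the time we get here.
            Console.WriteLine("Disposed before catch: " + observed.IsDisposed);
        }
    }
}
```

In effect the using statement behaves like a try block whose finally block calls Dispose() on the variable.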

Defining Destructors

The destructor itself is defined using the following syntax:

class MyClass
{
~MyClass()
{
// cleanup code here
}

The compiler recognizes a destructor because it has the same name as the containing class, but is preceded by a tilde (~). The destructor must not take any parameters, or have any return type or access modifiers.

As we have said, you should also derive the class from IDisposable and implement the Dispose() method, which takes no parameters and returns void . In general, the Dispose() method will contain the same cleanup code as the destructor, as well as calling GC.SuppressFinalize() to prevent the destructor from being called subsequently.

class MyClass : IDisposable
{
   ~MyClass()
   {
      // cleanup code here
   }

   // implements IDisposable.Dispose(); the override keyword is not used
   public void Dispose()
   {
      // cleanup code here
      GC.SuppressFinalize(this);
   }
}

From this code we see that SuppressFinalize() takes one parameter - an object reference that indicates the object whose destructor is no longer required.
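One way to avoid writing the cleanup code twice is to move it into a private helper that both the destructor and Dispose() call. The sketch below assumes a helper named CleanUp() and a flag named cleanedUp, both invented for illustration; the flag also makes a second call to Dispose() harmless. Microsoft's documentation describes a fuller variation built around a Dispose(bool) overload, which is not shown here.

```csharp
using System;

class MyClass : IDisposable
{
    private bool cleanedUp; // guards against releasing the resource twice

    // Hypothetical helper standing in for the release of an external resource.
    private void CleanUp()
    {
        if (cleanedUp)
            return;
        // release file handles, connections, etc. here
        cleanedUp = true;
    }

    public bool IsCleanedUp
    {
        get { return cleanedUp; }
    }

    // Backup: runs only if the client never called Dispose().
    ~MyClass()
    {
        CleanUp();
    }

    // Primary route: called explicitly, or through a using statement.
    public void Dispose()
    {
        CleanUp();
        GC.SuppressFinalize(this);
    }
}

class Program
{
    static void Main()
    {
        MyClass obj = new MyClass();
        obj.Dispose();
        obj.Dispose(); // harmless: the guard makes the second call a no-op
        Console.WriteLine(obj.IsCleanedUp); // True
    }
}
```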

Other Variations

The above sequence is not the only possibility, but it represents the usual recommended way of implementing a disposal sequence. In particular, any of the stages just described can be omitted. However, if you do not implement Dispose() , performance will suffer, since disposal will always happen via the destructor. If you do not implement the destructor itself, then there is no safeguard against a badly behaved client forgetting to call Dispose() . (This is, however, the only real option for structs, for which you cannot define a destructor, since structs are not garbage collected.)

You could also theoretically implement Dispose() without deriving from IDisposable , but you don't save any coding effort that way, and it means that clients cannot use the interface to dynamically find out if your class implements Dispose() . A side effect of this is that attempting to use the using syntax to automatically have Dispose() called for such an object will cause a compilation error since this syntax relies on the object implementing IDisposable .

In some situations you may want to be able to free external resources, while keeping the option of reopening them later. This might be the case, for example, for a class that implements a database connection, where you might wish to close the connection for periods when you are not accessing the database. For this purpose, Microsoft recommends that you implement a second method, Close() , which should return void and be implemented in the same way as Dispose() , except that it will not call GC.SuppressFinalize() .

Operator Overloads

C# allows you to define overloads of operators. The lists below show which operators may be overloaded. You should note that there are some operators that cannot be overloaded because their meanings are fixed (for example = ), while other operators cannot be overloaded but are constructed from other operators that can. For example, the compiler always breaks down the addition-assignment operator, += , into its components ( + and = ). Hence if you overload the addition operator ( + ) for a particular class or struct, then the corresponding addition-assignment operator will automatically have its meaning defined by your + operator overload.

The following operators may be overloaded

  • The binary arithmetic operators: + - * / %
  • The unary arithmetic operators: +, -
  • The prefix versions of the unary operators ++ and -- (not the postfix versions)
  • The comparison operators: != , == , < , <= , > , >=
  • The binary bitwise operators: & , | , ^ , >> , <<
  • The unary operators: ! , ~
  • The operators true , false

The following operators cannot be overloaded.

  • The arithmetic assignment operators *= , /= , += , -= , %= (these are worked out implicitly by combining the corresponding arithmetic operator overload with the assignment operator).
  • The bitwise assignment operators &= , |= , ^= , >>= , <<= (these are worked out implicitly by combining the corresponding bitwise operator overload with the assignment operator).
  • The Boolean operators && , || (these are worked out by the compiler from the corresponding bitwise operator overloads).
  • The postfix increment and decrement operators, ++ and -- . These are worked out implicitly from the overloads to the prefix ++ and -- operators.
  • The assignment operator, = . The meaning of this operator in C# is fixed.
  • The pointer-related operators, unary * , unary & and -> ; the meanings of these operators are fixed.
  • The ternary operator: ? :

It is also not possible to overload the member access operator ( . ), or any of the forms of brackets () , [] and {} .

Note that there are a couple of restrictions for overloading the comparison operators. Any overloads of them must return bool , and they must be overloaded in pairs: if you overload == , you must also overload != . The same applies to < and > , and to <= and >= . Also, if you overload == then you are strongly recommended to override the Equals() method for your class (all classes inherit this method from System.Object ). If you do not, the compiler will give a warning.
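The pairing rules can be sketched with a small value type. The Temperature struct below is hypothetical and exists only to show == paired with != , < paired with > , and Equals() and GetHashCode() kept consistent with == .

```csharp
using System;

struct Temperature
{
    public readonly double Degrees;

    public Temperature(double degrees)
    {
        Degrees = degrees;
    }

    // == and != must be overloaded together.
    public static bool operator ==(Temperature lhs, Temperature rhs)
    {
        return lhs.Degrees == rhs.Degrees;
    }

    public static bool operator !=(Temperature lhs, Temperature rhs)
    {
        return !(lhs == rhs);
    }

    // < and > form a pair (as do <= and >=, not shown here).
    public static bool operator <(Temperature lhs, Temperature rhs)
    {
        return lhs.Degrees < rhs.Degrees;
    }

    public static bool operator >(Temperature lhs, Temperature rhs)
    {
        return lhs.Degrees > rhs.Degrees;
    }

    // Keep Equals() and GetHashCode() consistent with ==.
    public override bool Equals(object obj)
    {
        return obj is Temperature && this == (Temperature)obj;
    }

    public override int GetHashCode()
    {
        return Degrees.GetHashCode();
    }
}

class Program
{
    static void Main()
    {
        Temperature cold = new Temperature(5.0);
        Temperature warm = new Temperature(25.0);
        Console.WriteLine(cold < warm);                  // True
        Console.WriteLine(cold == new Temperature(5.0)); // True
        Console.WriteLine(cold != warm);                 // True
    }
}
```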

Using Operator Overloads

Once you have overloaded an operator, you simply use it in the normal way for that operator. For example, suppose you have a Vector class. Mathematically, it is possible to add vectors, the result being another vector. Assuming you define an overload for + that takes two Vector instances as its parameters and returns a Vector , you would be able to write:

Vector V1, V2, V3;
// initialize V1 and V2 here
V3 = V1 + V2;

Defining Operator Overloads

Operator overloads are treated syntactically as static member methods from the point of view of defining them. The name of the method is the keyword operator , followed by the symbol used for that operator. So, for example, to provide the overload for addition for Vectors that will allow the above code to compile, we would write

class Vector
{
public static Vector operator + (Vector lhs, Vector rhs)
{
// code to add lhs and rhs and return a Vector here
}

Note that this operator overload must take two parameters, since addition is defined as a binary operator. In general, when overloading an operator you must supply the same number of parameters that the operator normally takes. Apart from the requirement that at least one parameter be of the class or struct in which the overload is defined, these parameters can be of any type you wish, as can the return type (with the exception that the return type for all overloads of the comparison operators, == , != , < , >= , <= , and > must be bool ).
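A complete, runnable version of the addition overload might look like the sketch below. The two-component layout of Vector (fields X and Y) and the ToString() override are assumptions made for the sake of the example. The last lines also show the compound += operator picking up its meaning from the + overload.

```csharp
using System;

class Vector
{
    public readonly double X;
    public readonly double Y;

    public Vector(double x, double y)
    {
        X = x;
        Y = y;
    }

    // Binary operator: exactly two parameters, declared public static.
    public static Vector operator +(Vector lhs, Vector rhs)
    {
        return new Vector(lhs.X + rhs.X, lhs.Y + rhs.Y);
    }

    public override string ToString()
    {
        return "(" + X + ", " + Y + ")";
    }
}

class Program
{
    static void Main()
    {
        Vector v1 = new Vector(1, 2);
        Vector v2 = new Vector(3, 4);
        Vector v3 = v1 + v2;
        Console.WriteLine(v3);   // (4, 6)

        v3 += v1;                // compiled as v3 = v3 + v1
        Console.WriteLine(v3);   // (5, 8)
    }
}
```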

The compiler treats operator overloads in the same way as methods when it comes to figuring out which overload is the one required in a given context. It can choose between any of the overloads that you supply, and all the other predefined overloads for each operator. For example, suppose the compiler encounters this code:

int X1, X2, X3;
double D1, D2;
Vector V1, V2, V3;
// initialize all these variables

V3 = V2 + V1; // OK
D1 = X1 + X2; // OK
D2 = V1 + X3; // WRONG - no suitable overload

Our first attempt to use the addition operator in this code is when we add V1 to V2 , storing the result in V3 . Here, the compiler will see that the two parameters passed to the operator ( V1 and V2 ) are both Vector instances, as is the result. Hence, the compiler will look for an overload of this operator that takes two Vectors as its parameters and returns a Vector . Since we have provided one, there is no problem.

Next, we attempt to add two ints and store the result in a double . Again, this is no problem, since the compiler knows how to add two ints. In this case, the predefined overload of + returns an int , but that is no problem either, because the compiler can implicitly convert an int to a double .

The final line of this code will cause a compilation error. In this case, we're trying to add a Vector to an int and store the result in a double . The compiler will therefore look for an overload of + that takes a Vector as its first parameter and an int as its second parameter. Since we have not provided one, the compiler will next look to see whether any casts on Vector or int are available that it can use to convert the parameters to data types for which a + overload is available. However, since we haven't supplied any casts to convert Vector to or from anything else, the compiler will not find any way of doing this either. At this point it will give up and generate a compilation error.

User-Defined Casts

Earlier in this appendix, when we covered variables, we described how C# allows data types to be converted to other data types using casts. A cast can be explicit or implicit.

int X1 = 10;
double D2 = X1; // implicit
int X2 = (int)D2; // explicit

For the predefined data types, implicit casts are available where it is not possible for anything to go wrong with the operation. If, however, it is possible in principle that the conversion might raise an exception or cause an overflow or loss of data (as is the case for a double -to- int conversion), then only an explicit cast is available.

The same principles apply for casts between your own data types.

You may define a cast to convert any data type to any other data type, but the following restrictions apply.

  • The cast must be marked as either implicit or explicit, to indicate how you intend it to be used.
  • It is not possible to define casts between classes where one class is directly or indirectly derived from the other. This is because casts already exist in that case.
  • The cast must be defined as a static member of either the source data type or the destination data type.

The final condition means that you can only define the cast if you have access to the source code of at least one of the data types. This effectively means that you can only define casts for those classes and structs that you have written.

The syntax for defining a cast is as follows. In this case we assume that we have two classes: Gif represents any Gif file, while StillGif represents any Gif file that does not contain any animated images. We also assume that Gif and StillGif are not in the same inheritance tree. From these definitions, a cast from Gif to StillGif may fail (if the Gif contains animations) and hence, if good programming practice is being followed, should be explicit, while a cast the other way round will always succeed, and so should be implicit.

We define the explicit cast as follows.

   public static explicit operator StillGif (Gif source)
{
// code to do conversion. Must return a StillGif
}

While the implicit cast would look like this

   public static implicit operator Gif (StillGif source)
{
// code to do conversion. Must return a Gif
}

From this we see that a cast is defined in a similar way to a method, except that the 'name' of the method consists of the keyword operator followed by the name of the destination type. The cast must take one parameter, which is the source data type instance to be converted.

Note that although not a syntax error, it would be considered extremely bad programming practice for the implementation of the cast to modify the source value passed in as a parameter in any way. The developers who write client code would not be expecting this behavior.

For the above examples, either cast may be defined in either the Gif or the StillGif class definition (though not both).

If you define a cast between a custom class or struct, and a predefined data type, then the cast must be placed in the definition of the custom class or struct.
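The two casts can be put together in a runnable sketch. The FrameCount field and the use of InvalidCastException to signal a failed conversion are assumptions made for illustration; the original Gif and StillGif classes are not specified in that much detail. Both casts are placed in StillGif here, which is permitted because StillGif is either the source or the destination type in each case.

```csharp
using System;

class Gif
{
    public readonly int FrameCount; // hypothetical: more than 1 means animated

    public Gif(int frameCount)
    {
        FrameCount = frameCount;
    }
}

class StillGif
{
    // Always succeeds, so it is marked implicit.
    public static implicit operator Gif(StillGif source)
    {
        return new Gif(1);
    }

    // May fail, so it is marked explicit.
    public static explicit operator StillGif(Gif source)
    {
        if (source.FrameCount != 1)
            throw new InvalidCastException("Gif contains animation");
        return new StillGif();
    }
}

class Program
{
    static void Main()
    {
        StillGif still = new StillGif();
        Gif gif = still;                   // implicit cast
        StillGif back = (StillGif)gif;     // explicit cast, succeeds

        try
        {
            StillGif bad = (StillGif)new Gif(12);
        }
        catch (InvalidCastException e)
        {
            Console.WriteLine(e.Message);  // Gif contains animation
        }
    }
}
```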

Interfaces

An interface is similar to a class or struct, but it does not contain any implementations for any of its methods etc. The real intention of an interface is that it provides a way for a class or struct to declare that it supports certain methods, properties, events or indexers. For this reason an interface is often regarded as a contract. An interface itself is defined using the following syntax.

public interface IEnumerable
{
IEnumerator GetEnumerator();
}

For this example we have taken an interface from the .NET base classes, defined in the System.Collections namespace.

This code shows that the syntax for defining the members of an interface is the same as that for defining the members of a class or struct, except that, because the methods do not have any implementations, the name of each method, etc. is simply followed by a semicolon. Also notice that they do not have any access modifiers and are not declared as virtual etc. It is regarded as the job of the class that implements an interface to add any such modifiers to the methods. The interface should simply indicate the return types and parameters, and, as indicated in this sample, it is perfectly acceptable for a method to return an interface reference. This particular interface only defines one method, but most interfaces will define more than one.

IEnumerable is required to be implemented by collections, and in particular, the foreach loop requires this interface to be implemented on the class it enumerates over.

An interface may inherit from one or more other interfaces, in which case it is considered as containing all the methods of the base interfaces as well as those defined explicitly in the derived interface.

public interface ICollection : IEnumerable
{

// Properties
int Count { get; }
bool IsSynchronized { get; }
object SyncRoot { get; }

// Methods
void CopyTo(Array array, int index);
}

The ICollection interface defines GetEnumerator() (inherited from IEnumerable ), as well as its own methods. Classes may implement this interface, which means that they are declaring they support more sophisticated collection facilities than those offered by IEnumerable .

A class implements an interface by deriving from it. Details of this are given in the section about classes.

class MyCollection : ICollection
{
// implementations of members. Must include implementations of
// all ICollection (and IEnumerable) members.
}

A class that implements an interface can be cast to that interface, and methods called against the interface reference. In general, you use interface references in exactly the same way that you use class references, except that you cannot instantiate an interface - you get an interface reference either by casting a class reference or as the return value from a method. Both techniques are illustrated here.

MyCollection SomeCollection = new MyCollection();
// initialize SomeCollection
IEnumerable Enumerable = (IEnumerable) SomeCollection;
IEnumerator TheEnumerator = Enumerable.GetEnumerator();

// this would work too.
IEnumerator OtherEnumerator = SomeCollection.GetEnumerator();
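Implementing the whole of ICollection takes a fair amount of code, so the sketch below uses a hypothetical class, Playlist , that implements only IEnumerable . It shows the class being used both through an explicitly cast interface reference and through foreach .

```csharp
using System;
using System.Collections;

// Hypothetical collection that implements only IEnumerable.
class Playlist : IEnumerable
{
    private string[] titles = { "Intro", "Theme", "Finale" };

    public IEnumerator GetEnumerator()
    {
        // Delegate to the array's own enumerator.
        return titles.GetEnumerator();
    }
}

class Program
{
    static void Main()
    {
        Playlist list = new Playlist();

        // Through an interface reference obtained by casting...
        IEnumerable enumerable = (IEnumerable)list;
        IEnumerator enumerator = enumerable.GetEnumerator();
        while (enumerator.MoveNext())
            Console.WriteLine(enumerator.Current);

        // ...or implicitly, through foreach.
        foreach (string title in list)
            Console.WriteLine(title);
    }
}
```

Each loop prints the three titles in order.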

Enumerations

An enumeration is, strictly speaking, an instance of a value type that is derived from the base class System.Enum . This base class is intended for types that simply contain closed lists of values. To make using such lists of values easier, the C# language uses the enum keyword, which wraps a special syntax around the System.Enum class, in much the same way that string wraps a special syntax around System.String .

The syntax for defining an enumeration looks like this:

public enum Platform {Win30, Win31, Win95, Win98, WinMe, WinNT351, WinNT4,
Win2000, WinXP}

By defining an enumeration, you are in effect defining a new data type. Hence you may only define a new enumeration either as a top-level object, in a namespace or as a member of a class or struct. Once an enumeration has been defined, you may declare and use instances of it like this:

Platform OperatingSystem;
OperatingSystem = Platform.WinMe;

Since you can never write code inside an enumeration, you will always be referring to the name of an enumerated value with the syntax <EnumName>.<Value> .

By default, enumerated values are stored as ints. The first enumeration value in the list has the value 0 and other values are numbered consecutively upwards 1, 2, 3 etc. However, you can change this order by specifying a particular value for any enumerated value.

public enum Platform {Win30, Win31, Win95=95, Win98, WinMe, WinNT351,
WinNT4, Win2000, WinXP}

In this case, we have assigned the value 95 to Win95. Numbering will then carry on from this point, so, for example, with the above code, Win98 will have the value 96, WinMe 97 etc.

It is also permitted to change the underlying data type that is used to store any enumeration. You may choose any of the integral data types ( sbyte , byte , short , ushort , int , uint , long , ulong , though not char ) and do so by indicating the data type after the enum keyword. For example, to store this enumeration as ulong :

public enum Platform : ulong {Win30, Win31, Win95=95, Win98, WinMe,
WinNT351, WinNT4, Win2000, WinXP};

The ToString() method has been overridden in System.Enum , so that it returns a string representation of the value. For example the code

string OsName = Platform.Win31.ToString();

will place the string " Win31 " in OsName .

A static Parse() method that takes a string as a parameter has also been implemented in System.Enum , which enables you to do this:

Platform Os = (Platform)Enum.Parse(typeof(Platform), "WinNT4");

This will result in Os containing the value, WinNT4 .

Explicit casts are available to convert enumerated values to the corresponding numeric value.

int OsAsInt = (int)Os;

If Os has the value WinNT4 , this will place the value 99 in OsAsInt .
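The numbering, ToString() , Parse() and cast behavior described in this section can be checked with a short program. It uses the version of Platform in which Win95 is assigned the value 95; the variable names are illustrative only.

```csharp
using System;

public enum Platform { Win30, Win31, Win95 = 95, Win98, WinMe, WinNT351,
                       WinNT4, Win2000, WinXP }

class Program
{
    static void Main()
    {
        Console.WriteLine((int)Platform.Win31);        // 1
        Console.WriteLine((int)Platform.Win98);        // 96
        Console.WriteLine(Platform.Win31.ToString());  // Win31

        Platform os = (Platform)Enum.Parse(typeof(Platform), "WinNT4");
        Console.WriteLine((int)os);                    // 99

        // An explicit cast also works in the other direction.
        Platform fromNumber = (Platform)97;
        Console.WriteLine(fromNumber);                 // WinMe
    }
}
```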

Delegates

A delegate is an instance of a class that is derived from the .NET base class, System.Delegate . System.Delegate is designed as a base class for any classes that hold references to methods. Just as with enumerations, C# provides a special syntax to wrap around the derived classes, which makes it more convenient to manipulate method references.

Defining a Delegate

A delegate is defined using the following syntax.

// declares a delegate that can wrap any method that returns double and takes
// an int parameter
public delegate double DoubleOp(int value);

// similarly declare a delegate for a method that returns a string and
// takes no parameters
public delegate string StringOp();

The definition consists of the delegate keyword followed by something that looks like the signature of a method. The name of this method will be the name of this class of delegate. The names given to the parameters in the signature are dummy names and do not have any significance; they are not actually used anywhere.

Note that the above code should be viewed as defining a data type - it does not actually declare any variables.

Declaring a Delegate

Instances of delegates are declared in the same way as instances of other data types.

StringOp GetAString;

When actually instantiating the delegate, you need to pass one parameter to the constructor, which is the details of the method that the delegate is going to refer to. For static methods, this means supplying the name of the class and the name of the method, while for instance methods, this involves supplying the name of the instance and the name of the method. The syntax is the same as the syntax for calling the relevant method, except that the parameter list is omitted. For example, suppose we have a class, Employee , which represents details of a company employee and contains the following methods.

class Employee
{
public string GetName()
{
// etc.
}

public static string GetCompanyName()
{
// etc.
}

We can declare a variable that is a delegate instance able to refer to each of these methods as follows.

// somewhere in a class or method

// declare variable
StringOp GetAString;

// initialize it
GetAString = new StringOp(Employee.GetCompanyName);

Notice the parentheses after the method name GetCompanyName are omitted. Adding them would result in the code actually calling the method - we just want to indicate which method we are referring to.

The type of the delegate is entirely defined by the signature of the method - that is to say the return type and the number and types of its explicit parameters. Whether the method is static, and what type of class the method is defined against, is irrelevant in this context. This means that a particular delegate instance can be initialized to any method in any class provided it has the appropriate number and types of parameter, and the appropriate return type. The above code initializes the GetAString variable to wrap round the static GetCompanyName() method, but we could equally well later set it to refer to the GetName() method instead - except that because GetName() is an instance method, we need an instance of Employee too.

Employee TheManager;
// assume we initialize TheManager to point to an employee instance

GetAString = new StringOp(TheManager.GetName);

Similarly, we could, if we wished, set GetAString to refer to a method of an instance of a completely different class - provided that the method has the same signature.

Using a Delegate

Once a delegate has been initialized to refer to a method, you can call the method by simply writing the name of the delegate followed by the parameters to the method in parentheses, just as if the delegate were the method. This is known as invoking the delegate.

Console.WriteLine(GetAString());
// results depend on what method GetAString was referring to.

However, what makes delegates really powerful is that, because a delegate is also regarded as a variable, its value can be passed around. This is the way that C# allows details of methods to be passed between methods.

// method that takes a delegate
// this method at some point needs to obtain a string and it doesn't care what
// method it uses to get it - you tell it what method to use
// by passing it as a parameter in a delegate
void ProcessAString(StringOp op)
{
// at some point in the method .
string TheString = op();
// process the string
}
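Putting the pieces of this section together gives the following runnable sketch. The Employee constructor, the company name returned by GetCompanyName() and the upper-casing done inside ProcessAString() are assumptions added so that the program has something to print.

```csharp
using System;

public delegate string StringOp();

class Employee
{
    private string name;

    public Employee(string name)
    {
        this.name = name;
    }

    public string GetName()
    {
        return name;
    }

    public static string GetCompanyName()
    {
        return "Example Corp"; // hypothetical value
    }
}

class Program
{
    // Doesn't care which method supplies the string.
    public static string ProcessAString(StringOp op)
    {
        string theString = op();
        return theString.ToUpperInvariant();
    }

    static void Main()
    {
        Employee theManager = new Employee("Jane");

        // Static method: class name and method name.
        Console.WriteLine(ProcessAString(new StringOp(Employee.GetCompanyName))); // EXAMPLE CORP

        // Instance method: instance name and method name.
        Console.WriteLine(ProcessAString(new StringOp(theManager.GetName)));      // JANE
    }
}
```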

Multicast Delegates

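A delegate is automatically a multicast delegate if the method to which it refers returns void . (In the released versions of the .NET Framework any delegate can in fact hold more than one method, but combining methods is really only useful for delegates that return void , since only the return value of the last method called is kept.) The significance of multicast delegates is that they can simultaneously refer to more than one method. Methods may be added to multicast delegates using the addition and addition assignment operators.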

For example, suppose we have a delegate defined as follows. This delegate is intended to display a message to the user, but we don't specify how it is to be done (message box, writing to console, etc.) - the method passed in to the delegate will determine that.

public delegate void ShowMessageOp(string value);

Now this delegate is a multicast delegate because it returns void .

We can set up a delegate instance that, when invoked, will display the message on the console:

ShowMessageOp DisplayIt = new ShowMessageOp(Console.WriteLine);

We can now add a second method to the delegate, which displays a message with a message box. This would normally be done by calling the Show() method of the .NET base class, System.Windows.Forms.MessageBox . However this method doesn't return void and so doesn't fit the requirements for a multicast delegate. So here we'll assume we've defined our own method, ShowDialog() , as a static method of a class OurUtilities , which has the correct signature and returns void :

DisplayIt += new ShowMessageOp(OurUtilities.ShowDialog);

Now if DisplayIt is invoked, it will call both methods in turn, so displaying the same message in two different ways:

DisplayIt("This message will get displayed twice");

It is also possible to remove methods from multicast delegates by using the subtraction and subtraction assignment operators.

DisplayIt -= new ShowMessageOp(OurUtilities.ShowDialog);

Or alternatively

ShowMessageOp NewDisplayOp = DisplayIt -
new ShowMessageOp(OurUtilities.ShowDialog);

When a multicast delegate is invoked, each method that it refers to is invoked in turn, each one being passed the same set of parameters.
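The adding, invoking and removing of methods can be tried out with the sketch below. To keep it a console program, OurUtilities.ShowDialog() is a hypothetical stand-in that writes to the console rather than displaying a real dialog. GetInvocationList() is a method that delegates inherit from System.Delegate and is used here only to count the methods currently attached.

```csharp
using System;

public delegate void ShowMessageOp(string value);

class OurUtilities
{
    // Hypothetical stand-in for a method that would show a dialog.
    public static void ShowDialog(string message)
    {
        Console.WriteLine("[dialog] " + message);
    }
}

class Program
{
    static void Main()
    {
        ShowMessageOp displayIt = new ShowMessageOp(Console.WriteLine);
        displayIt += new ShowMessageOp(OurUtilities.ShowDialog);

        // Both methods run, in the order in which they were added.
        displayIt("hello");
        // hello
        // [dialog] hello

        displayIt -= new ShowMessageOp(OurUtilities.ShowDialog);
        displayIt("goodbye");
        // goodbye

        Console.WriteLine(displayIt.GetInvocationList().Length); // 1
    }
}
```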

Events

An event is a multicast delegate that has the following signature.

void EventHandler(object sender, System.EventArgs e)

The second parameter can be a reference to a class derived from System.EventArgs .

The significance of events is that they are used in the standard Windows event callback mechanism by which applications receive notifications of interesting events (such as actions by the user concerning the mouse or keyboard). Because of the importance of this mechanism, C# uses the event keyword to make using events syntactically slightly easier. In the above signature, sender is intended to provide a reference to the object that raised the event. The e parameter, on the other hand, contains information relating to the event (for example, which key was pressed, where the mouse was when it was clicked), although in order to provide this information, the EventArgs class must be derived from by a specialized class that can supply the information relating to a particular type of event.

In order to define an event, the code that raises the event must define the relevant delegate type. For example, suppose an application implemented a class, NetworkAnalyser , which monitored the network for FTP requests, and so was able to notify other code when FTP requests came in. It might define the following delegate.

class NetworkAnalyser
{
public delegate void FTPRequest(object Sender, FTPEventArgs e);

Here FTPEventArgs is a class we have defined that contains information related to an FTP request (source IP address etc.), and which is derived from System.EventArgs . We might for example define FTPEventArgs like this:

class FTPEventArgs : EventArgs
{
private string sourceIPAddress;
private string ftpCommand;
private StringCollection ftpCommandParameters;

public FTPEventArgs(string sourceIPAddress, string ftpCommand,
StringCollection ftpCommandParameters)
: base()
{
// etc.
// also define public readonly properties for the fields.
}

This definition stores the IP address of the computer issuing the request, and the command and parameters sent, and so could make this information available to event handlers. It uses StringCollection , a .NET base class defined in System.Collections.Specialized , which implements a collection of strings.

We would then declare the event as an instance of the appropriate delegate.

public event FTPRequest OnFTPRequest;

A client that wishes to be notified of FTPRequests would then implement a handler:

protected void FTPRequestHandler(object sender, FTPEventArgs e)
{
// code to handle this request
}

It would also register this handler with the event, so that the handler is called whenever the event is raised, using the normal syntax for adding methods to multicast delegates:

NetworkAnalyser Analyser;
// initialize Analyser so it refers to the Network Analyser instance

// now add our handler to the event
Analyser.OnFTPRequest += new NetworkAnalyser.FTPRequest(this.FTPRequestHandler);
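The fragments above do not show how the event is actually raised. The sketch below fills that gap under some assumptions: FTPEventArgs is cut down to two public read-only fields, and SimulateRequest() is a hypothetical method that pretends an FTP request has arrived. Note the null check, since an event with no handlers attached is null .

```csharp
using System;

class FTPEventArgs : EventArgs
{
    public readonly string SourceIPAddress;
    public readonly string FtpCommand;

    public FTPEventArgs(string sourceIPAddress, string ftpCommand)
    {
        SourceIPAddress = sourceIPAddress;
        FtpCommand = ftpCommand;
    }
}

class NetworkAnalyser
{
    public delegate void FTPRequest(object sender, FTPEventArgs e);
    public event FTPRequest OnFTPRequest;

    // Hypothetical method that pretends a request has arrived.
    public void SimulateRequest(string sourceIPAddress, string ftpCommand)
    {
        // The event is null until at least one handler is attached.
        if (OnFTPRequest != null)
            OnFTPRequest(this, new FTPEventArgs(sourceIPAddress, ftpCommand));
    }
}

class Program
{
    public static string LastMessage; // illustration only: records the last event seen

    static void FTPRequestHandler(object sender, FTPEventArgs e)
    {
        LastMessage = e.FtpCommand + " from " + e.SourceIPAddress;
        Console.WriteLine(LastMessage);
    }

    static void Main()
    {
        NetworkAnalyser analyser = new NetworkAnalyser();
        analyser.SimulateRequest("192.0.2.1", "QUIT"); // no handlers yet: nothing happens

        analyser.OnFTPRequest += new NetworkAnalyser.FTPRequest(FTPRequestHandler);
        analyser.SimulateRequest("192.0.2.1", "LIST"); // LIST from 192.0.2.1
    }
}
```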

Exceptions

Exceptions are the normal means by which C# code should handle exceptional error conditions.

In order to handle exceptions correctly you divide code into three types of blocks known respectively as try , catch , and finally blocks.

try
{
// normal code
}
catch (SomeException e) //SomeException is an exception class defined
{ // in your code
// error handling code
}
catch (SomeOtherException e)
{
// error handling code
}
finally
{
// cleanup code
}

This code illustrates the principles, but there are some variations. It is permitted to omit the finally block, and there may be as many catch blocks as you wish, but each one should take a parameter of a different type. However, the parameter to each catch block must be a reference to a class derived directly or indirectly from the .NET base class, System.Exception . Classes so derived are known as exceptions or exception classes. Note that it is also permitted to omit all catch blocks - in this case the construction is not used to trap any errors, but simply to provide a way of guaranteeing that code in the finally block will be executed when execution flow leaves the try block.

  • The try block contains the normal code of your program, in which an error condition may arise.
  • Each catch block contains code that handles a certain type of error condition.
  • The finally block, if present, contains code to perform any clean up that is required after leaving the try block.

The idea is that if some unexpected error condition arises, the program should respond in the try block by throwing an exception. Throwing an exception is accomplished by the throw statement.

throw new SomeException("An error has occurred");

The throw statement takes one parameter, which syntactically appears straight after the throw keyword. This parameter is the exception instance that is being thrown. There is no need for any enclosing brackets surrounding it.

As soon as execution hits a throw statement, it immediately leaves the try block and enters the first available catch block that can handle that exception. In the process of leaving the try block, any variables that were scoped to the try block or to any methods called from within it will automatically pass out of scope. There is no limit to how deeply the code might have passed into further method calls after entering the try block. Execution will always smoothly exit from all of these immediately. The first available catch block is identified using the particular class of exception that was thrown. As noted above, each catch block takes a parameter which is an instance of a class derived from System.Exception . A catch block is able to accept a particular exception if its parameter is either of the same class as the exception or is of a base class of that exception. This is equivalent to saying that the reference in the catch block must be able to take the exception that was thrown as a referent.

When code leaves a catch block it automatically enters the finally block. If no exception is thrown in a try block, then eventually execution will simply pass out of the try block - in this case, control automatically then passes straight to the finally block. It does not matter how code exits from a try block, the finally block will always be executed. Afterwards, control usually proceeds to the next executable statement following the finally block.
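The guarantee that the finally block always runs can be seen in a small sketch. The class and method names here (FinallyDemo, ReadValue, Run) are invented for illustration. The try block is left through a return statement rather than by running off its end, and the finally block is still executed before the method returns.

```csharp
using System;

public class FinallyDemo
{
    static string log = "";

    static int ReadValue()
    {
        try
        {
            log += "try;";
            return 42;           // leaves the try block via return
        }
        finally
        {
            log += "finally";    // still runs before the method returns
        }
    }

    // Returns the value read, followed by the order in which blocks ran.
    public static string Run()
    {
        log = "";
        int value = ReadValue();
        return value + ":" + log;
    }

    public static void Main()
    {
        Console.WriteLine(Run());    // 42:try;finally
    }
}
```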

As an example, consider this code. Assume a method is defined like this.

// this method expects Value to refer to something and Value2 to be > 10
void DoSomething(MyClass Value, int Value2)
{
    if (Value == null)
        // first argument is the parameter name, second is the message
        throw new ArgumentNullException("Value", "Value was null in DoSomething()");
    else if (Value2 <= 10)
        throw new ArgumentException("Value2 was <= 10 in DoSomething()");
    // code for method
}

// later on
try
{
    DoSomething(Value, Value2);    // assuming these variables are defined
                                   // and initialized
}
catch (ArgumentNullException)
{
    // error handling code
}
catch (ArgumentException)
{
    // error handling code
}
finally
{
    // cleanup
}

In order to understand this code we need to be aware that ArgumentException and ArgumentNullException are two exception classes in the .NET base classes, and that ArgumentNullException is derived from ArgumentException. ArgumentException is used to indicate a general problem with a parameter passed to a method, while ArgumentNullException is used specifically to indicate that an argument passed to a method was null but should have been referring to some object.

If DoSomething() is called with null passed as the first parameter, then an ArgumentNullException is thrown. Since the first catch block takes precisely this class of exception, this will be able to handle that exception, and therefore this catch block will be executed, followed by the finally block. If on the other hand DoSomething() is called with 0 passed as the second parameter, then an ArgumentException will be thrown. Execution flow will miss the first catch handler because that requires an ArgumentNullException , or class derived from it. However, the second catch handler takes an ArgumentException , and therefore can handle this exception.
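The selection of a catch block described above can be checked with a self-contained sketch. It is only an illustration: the class CatchSelectionDemo and the method Call are invented names, and object stands in for MyClass so that the sample compiles on its own.

```csharp
using System;

public class CatchSelectionDemo
{
    // Stand-in for DoSomething(): object replaces the MyClass type.
    static void DoSomething(object value, int value2)
    {
        if (value == null)
            throw new ArgumentNullException("value");
        if (value2 <= 10)
            throw new ArgumentException("value2 must be greater than 10");
    }

    // Returns a string recording which blocks ran, in order.
    public static string Call(object value, int value2)
    {
        string trace = "";
        try
        {
            DoSomething(value, value2);
            trace += "ok;";
        }
        catch (ArgumentNullException)
        {
            trace += "null-handler;";
        }
        catch (ArgumentException)
        {
            trace += "argument-handler;";
        }
        finally
        {
            trace += "finally";
        }
        return trace;
    }

    public static void Main()
    {
        Console.WriteLine(Call(null, 20));      // null-handler;finally
        Console.WriteLine(Call("text", 0));     // argument-handler;finally
        Console.WriteLine(Call("text", 20));    // ok;finally
    }
}
```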

Note that the order of the catch statements is important. For example, if we had coded this:

catch (ArgumentException)
{
    // error handling code
}
catch (ArgumentNullException)
{
    // error handling code
}

Then the second catch block will never be reachable because an ArgumentNullException would be picked up by the first catch block. The compiler would detect this and flag a compilation error.

If an exception is generated and no catch blocks within the program are able to handle it, then program execution terminates and the .NET runtime will handle the exception, which normally means displaying a message box complaining that the exception was not handled in your code. If you wish, you can ensure all CLS-compliant .NET-generated exceptions are caught by providing a catch handler with this signature.

catch (Exception)
{
    // error handling code
}

This is because according to the rules of the .NET Common Language Specification, all exceptions must be derived from System.Exception .

You can extend this to cover any exceptions that are not CLS-compliant or are even generated outside the .NET environment altogether (and which will therefore probably not be derived from System.Exception ) by adding a catch block that takes no parameters to the end of your list of catch statements.

catch
{
    // error handling code
}

Although this will trap any object that has been thrown, it has the disadvantage that you have no access to the exception object, and hence cannot find out any information about the cause of the error from it.

Nested Try Blocks

It is possible to nest try blocks inside each other.

try
{
    // normal code
    try
    {
        // normal code
    }
    catch (SomeException e)
    {
        // error handling code
    }
    finally
    {
        // cleanup code
    }
}
catch (SomeOtherException e)
{
    // error handling code
}
finally
{
    // cleanup code
}

Nested try blocks listed in this way usually act independently. However, they interact if an exception generated inside the inner try block is not handled by any of the catch handlers of the inner try block. In this case, the search for a suitable exception handler moves up to the catch blocks associated with the outer try block (in the process, the inner finally block will be executed but no other code inside the outer try block will be). So, assuming a suitable handler is found with the outer try block, the order of execution will be throw statement -> inner finally block -> outer catch handler -> outer finally block, after which execution continues at the first statement after the outer finally block.
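The order of execution just described can be traced with a short sketch. The names NestedTryDemo and Run are invented for illustration. The inner catch block takes an ArgumentException, which is unrelated to the InvalidOperationException that is thrown, so the search moves out to the outer try block.

```csharp
using System;

public class NestedTryDemo
{
    // Returns a string recording the order in which the blocks ran.
    public static string Run()
    {
        string trace = "";
        try
        {
            try
            {
                trace += "throw;";
                throw new InvalidOperationException("not handled by the inner try");
            }
            catch (ArgumentException)
            {
                trace += "inner catch;";    // skipped: wrong exception type
            }
            finally
            {
                trace += "inner finally;";
            }
        }
        catch (InvalidOperationException)
        {
            trace += "outer catch;";
        }
        finally
        {
            trace += "outer finally";
        }
        return trace;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
        // throw;inner finally;outer catch;outer finally
    }
}
```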

Throwing Exceptions from Inside Catch or Finally Blocks

In the scenario above, if an exception is thrown from within the inner catch handler or inner finally block then the search for a suitable handler starts with the outer catch blocks. The implication of this is that if an exception is thrown from inside one of the outer catch blocks or the outer finally block, this will automatically be regarded as unhandled (unless the whole construction is itself enclosed in a further try block).

Usually, C# syntax requires that a throw statement specify an exception object. However, in the particular case in which an exception is being thrown from inside a catch block, it is acceptable to omit the exception object.

// inside catch block
throw;

In this case, the compiler will simply arrange for the exception currently being handled to be thrown again.
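As a sketch of where this is useful, the following sample logs an exception and then passes the same exception object on to its caller. The names RethrowDemo, Risky, LogAndRethrow, and Run are invented for illustration.

```csharp
using System;

public class RethrowDemo
{
    static void Risky()
    {
        throw new InvalidOperationException("original failure");
    }

    static void LogAndRethrow()
    {
        try
        {
            Risky();
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("logging, then passing the exception on");
            throw;    // rethrows the exception currently being handled
        }
    }

    // Returns the message of the exception that reaches the caller.
    public static string Run()
    {
        try
        {
            LogAndRethrow();
            return "no exception";
        }
        catch (InvalidOperationException e)
        {
            return e.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run());    // original failure
    }
}
```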

Exception Objects

The .NET base classes include an extensive class hierarchy of exception objects and it is permitted to define new ones appropriate to your code. The most usual constructor for an exception object is the one demonstrated in all the code samples above - it takes one string as a parameter. This string should describe the error condition and is available as the Message property of the exception object. Hence one very basic catch handler might read:

catch (SomeException e)
{
    MessageBox.Show(e.Message);    // MessageBox is in System.Windows.Forms
}

If you are throwing a second exception from within a catch block, it is common practice to string the two exception objects together. This is usually done by passing in the original exception as a new parameter, innerException , to the constructor of the new exception instance. The .NET base class exception classes feature two-parameter constructors that allow you to do this, and you are strongly advised to provide the same feature in any exception classes that you define.

catch (SomeException e)
{
    // code
    throw new SomeOtherException("What happened was etc.", e);
}

The usual reasons for throwing second exceptions are because something else went wrong while you were trying to handle the first exception or in order to provide additional information about the first exception.
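A minimal sketch of an exception class that offers both constructors, and of a handler that chains the original exception into a new one, might look like this. ConfigLoadException, ParsePort, and the other names are hypothetical; the original exception is then available through the InnerException property.

```csharp
using System;

// Hypothetical exception class, defined only for this illustration.
public class ConfigLoadException : Exception
{
    public ConfigLoadException(string message) : base(message) { }

    // Two-parameter constructor that records the original exception.
    public ConfigLoadException(string message, Exception innerException)
        : base(message, innerException) { }
}

public class InnerExceptionDemo
{
    static int ParsePort(string text)
    {
        try
        {
            return int.Parse(text);
        }
        catch (FormatException e)
        {
            // add context, keeping the first exception as InnerException
            throw new ConfigLoadException("Port setting is not a number: " + text, e);
        }
    }

    public static string Run()
    {
        try
        {
            ParsePort("eighty");
            return "parsed";
        }
        catch (ConfigLoadException e)
        {
            return e.Message + " | inner: " + e.InnerException.GetType().Name;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run());
        // Port setting is not a number: eighty | inner: FormatException
    }
}
```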

Attributes

Attributes are objects that can be attached to items in C# source code to provide extra information to the compiler concerning those items. Items to which they can be attached include methods, classes, structs, enums, method parameters, etc.

The syntax of an attribute consists of a name, optionally followed by one or more parameters in parentheses, the entire attribute being enclosed in square brackets and placed immediately before the item to which it applies. For example:

[STAThread] // indicates threading model of main program thread
public static void Main() // STAThread takes no parameters
{

Or:

// indicates that calls to the following method should be omitted by the
// compiler unless the named preprocessor symbol is defined
[Conditional("Debug")]
public void WriteDebugInfo()
{

Some attributes are recognized by the compiler and thus affect the compilation process, as those in the above examples do. Other attributes simply cause metadata to be left in the compiled assembly, which can later be retrieved using the System.Reflection classes, and may be used, for example, for extended documentation purposes. Since attributes are actually implemented as instances of .NET base classes that are derived from System.Attribute , it is quite possible to define your own custom attributes - though obviously such attributes will not be recognized by the compiler, and so will not have any effect other than leaving metadata in the assembly.
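As a sketch of a custom attribute, the following sample defines one, attaches it to a class, and reads it back at run time. ReviewedByAttribute, Invoice, and GetReviewer are names invented for illustration; GetCustomAttributes() is the reflection method that retrieves the metadata.

```csharp
using System;

// Hypothetical custom attribute, usable on classes and methods.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class ReviewedByAttribute : Attribute
{
    private string reviewer;

    public ReviewedByAttribute(string reviewer)
    {
        this.reviewer = reviewer;
    }

    public string Reviewer
    {
        get { return reviewer; }
    }
}

// The "Attribute" suffix may be omitted when the attribute is applied.
[ReviewedBy("A. Reviewer")]
public class Invoice
{
}

public class AttributeDemo
{
    // Returns the reviewer recorded on a type, or "(none)" if absent.
    public static string GetReviewer(Type type)
    {
        object[] found = type.GetCustomAttributes(typeof(ReviewedByAttribute), false);
        if (found.Length == 0)
            return "(none)";
        return ((ReviewedByAttribute)found[0]).Reviewer;
    }

    public static void Main()
    {
        Console.WriteLine(GetReviewer(typeof(Invoice)));    // A. Reviewer
        Console.WriteLine(GetReviewer(typeof(string)));     // (none)
    }
}
```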

Because it is possible to define further attributes, it is not possible to give a comprehensive list of the available attributes. However, some of the common ones are:

Attribute | Purpose
Conditional | Causes calls to a method to be omitted by the compiler unless a named preprocessor symbol is defined
DllImport | Indicates that a method is defined in a named external DLL instead of in source code or an assembly
Flags | Indicates that an enumerated list actually consists of bitwise flags that may be combined
Obsolete | Marks a method or class as obsolete, causing either a compiler warning or error (depending on parameters passed to the attribute) to be raised if any code attempts to use the obsolete item
STAThread | Indicates that a thread should run in an STA environment
StructLayout | Allows you to specify exactly how fields in a struct should be laid out in memory

Preprocessor Directives

In addition to the C# statements, C# supports a number of special commands known as preprocessor directives. These commands do not translate into any executable code directly, but they provide extra instructions to the compiler concerning precisely how it should compile the source code.

Preprocessor directives have a different syntax from normal C# statements: they always begin with a #, and use carriage returns rather than semicolons to indicate the end of the directive. Because of this, a preprocessor directive should be the only command on a line (though you may add a single-line comment after it).

#define/#undef

The #define directive defines a symbol, which will be present for the duration of compilation. #undef removes the symbol. The symbol so defined may be used by the #if , #elif , and #else preprocessor directives, as well as by the Conditional attribute.

#define Debug
#undef EnterpriseVersion

The #define and #undef directives must appear before any actual C# code. Note that preprocessor symbols may also be defined by supplying the appropriate flags to the compiler at the command line.

#if/#elif/#else/#endif

These directives can be used to construct an #if block, which acts in a similar way to the if ... else if ... else statement of the C# language. However, whereas if indicates that statements should be conditionally executed, #if indicates to the compiler that certain statements should be conditionally compiled, depending on whether a certain preprocessor symbol has been defined.

#if Enterprise
// put code here that you want compiled for the enterprise version
// of your software
#elif Professional
// code here for the Professional version
#else
// code for the home version
#endif

You can use the && , || , and ! operators to perform simple logical operations on the preprocessor symbols:

#if Enterprise && Debug
// compile something if both these symbols are defined
#endif

You can also use the symbols true and false . A symbol is considered to be true if it is defined. Using this, you can test if a symbol is undefined:

#if Enterprise == false
// compile something if Enterprise symbol not present
#endif
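Putting #define and #if together, a complete file might look like the following sketch. The class EditionDemo and its methods are invented for illustration, and the symbol names simply follow the examples above. Professional is defined and Enterprise is deliberately left undefined.

```csharp
#define Professional
// Enterprise is deliberately left undefined in this file

using System;

public class EditionDemo
{
    // Only one of the three return statements is compiled.
    public static string Edition()
    {
#if Enterprise
        return "Enterprise";
#elif Professional
        return "Professional";
#else
        return "Home";
#endif
    }

    public static bool HasEnterpriseFeatures()
    {
#if Enterprise == false
        return false;
#else
        return true;
#endif
    }

    public static void Main()
    {
        Console.WriteLine(Edition());                  // Professional
        Console.WriteLine(HasEnterpriseFeatures());    // False
    }
}
```

Note that the #define line has to stay at the very top of the file, ahead of the using directive.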

#line

The #line directive modifies the line number and file name used to report compiler warnings and errors.

#line 200 "Main.cs" // line numbering from here reset to line 200,
// file reported as Main.cs

The main use of #line will be if you have any software that processes your source files before handing them to the compiler. In that situation, compiler error messages will report a file name and line number that match the files generated by the intermediate software, and won't necessarily match the files you are editing. You can use #line to restore the match so that you can see which code is causing compilation errors or warnings.

#warning/#error

These directives respectively instruct the compiler to issue a warning, or to raise a compilation error.

#warning "Don't forget to finish the implementation of the CalculateInterest method!"
#if (Debug && Release)
#error "You have the Debug and the Release symbols defined"
#endif

#region/#endregion

These directives mark a block of source code as a region, and give it a descriptive name.

#region Member Fields
int X;
string Message;
#endregion

The main significance of #region and #endregion is that smart text editors may be aware of them, and may be able to use them to format the appearance of your source code appropriately. For example, the Visual Studio .NET code editor uses them to define areas of code that it can collapse.

Pointers and Unsafe Code

C# allows the use of pointers and pointer arithmetic, but only in blocks of code that have been specifically marked as unsafe.

Pointers are very similar to references in that they store the address in memory of where some data is stored. However, references are very safe to use because the syntax around them restricts you from doing anything other than accessing the data referenced as a class instance. Pointers are far more flexible because their syntax gives you access to the address itself. This means that you can perform arithmetic operations on the address, and explore or modify areas of memory selected by address rather than by reference to a named variable. As a result, operations using pointers can be done with very low overhead and very high performance, because you can write your code at a very low level, with direct memory operations. On the other hand, all the type safety of C# is lost, and it becomes very easy to write buggy code that corrupts data or areas of memory. It is for this reason that such code is restricted to unsafe areas.

You can mark either a single member of a class or struct as unsafe, or an entire class or struct as unsafe, by including the keyword unsafe in the definition of the member or of the class or struct definition.

public unsafe void DisplayResult(long *pData)
{

Or

unsafe class LinkedList // entire class is unsafe
{
ulong *Start; // only allowed in an unsafe class
ulong *End;

Declaring a method as unsafe means that pointers may be used anywhere in the implementation of that method, and its parameters may be of pointer data types. Declaring a class or struct as unsafe means that all members of that class or struct are automatically unsafe.

Variables may be declared as unsafe if they are member fields, but not if they are local variables.

public unsafe int *pX; // OK if declared as a field

It is also possible to mark block statements within methods as unsafe:

unsafe
{
    // do something with pointers
}

Pointer Syntax

A pointer to any value data type is declared by following the name of the data type with an asterisk.

long *pL1, pL2;    // both pL1 and pL2 are of type long* (unlike in C and C++)
double *pD;
MyStruct *pMyStruct;
int *pWidth;

Pointers can be declared to any value data type (provided that, in the case of a struct, it does not itself contain any reference-type fields), but it is not possible to declare a pointer to a reference data type. This is because the garbage collector is not aware of the existence of pointers, and so may at any time delete class instances - which would clearly cause problems if a pointer is pointing to an instance that is deleted.

Just as for a reference, declaring a pointer does not create any instance of the data pointed to; it simply declares a variable that will hold the address. You'll normally declare the variable pointed to separately and then assign the address of this variable to the pointer. This is done using the address-of operator, which is denoted by the ampersand symbol, &.

int Width = 20;
int *pWidth;
pWidth = &Width; // pWidth now points to Width

This code could alternatively have been written

int Width = 20;
int *pWidth = &Width; // pWidth now points to Width

Pointers can be dereferenced using the pointer dereference operator, for which the symbol is the asterisk.

int CopyOfWidth = *pWidth; // initializes CopyOfWidth to 20.

Note that there is never any confusion as to the meanings of * and &. When used in pointer operations they are unary operators, whereas, when used for the alternative meanings of arithmetic multiplication and bitwise AND, they are binary operators, and hence always appear in your source code between two operands.

The syntax above is fine when declaring pointers to variables that are stored on the stack. However, value types may also be stored on the heap in some cases - for example, if they are member fields of a reference type - and in those cases the above syntax is not permitted. This is because data on the heap is subject to the garbage collector, and, even if not removed, may be moved to a new location on the heap if the garbage collector chooses to tidy it up. Instead, such pointers must be declared inside a fixed statement, which instructs the garbage collector not to move the class instance containing the member field(s) in question while the pointers remain in scope. This prevents bugs from occurring due to pointers containing out-of-date and incorrect addresses. Such bugs cannot arise with references, because the garbage collector is able to update references automatically when it moves class instances.

For example, consider this class:

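class Vector
{
    public double X;    // public so that code outside the class can
    public double Y;    // take the addresses of these fields
    public double Z;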

Although the class itself is a reference type, so that it is not possible to declare a pointer to a Vector, its members, X, Y, and Z, are all value types. Hence, you can legitimately declare a pointer to any of these fields. However, because these fields will be stored on the heap, C# requires that any pointers to them must be declared inside a fixed statement.

Vector MyVector;
// initialize MyVector

fixed (double *pX = &MyVector.X, pY = &MyVector.Y)
{
    // statements
}

All variables declared in the brackets following the fixed keyword will be scoped to the statement or block statements associated with the fixed statement.
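A complete sketch of the fixed statement in use follows. FixedDemo and DoubleX are invented names, and the Vector class is repeated with public fields so that the sample stands on its own. Because the code uses pointers, it must be compiled with the compiler's /unsafe option.

```csharp
using System;

public class Vector
{
    public double X;
    public double Y;
    public double Z;
}

public class FixedDemo
{
    // Doubles v.X by writing through a pointer to the field.
    public static unsafe double DoubleX(Vector v)
    {
        // v lives on the heap, so it must be pinned before the
        // address of one of its fields can be taken.
        fixed (double* pX = &v.X)
        {
            *pX = *pX * 2;
        }
        return v.X;
    }

    public static void Main()
    {
        Vector v = new Vector();
        v.X = 1.5;
        Console.WriteLine(DoubleX(v));    // 3
    }
}
```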

Just as for references, you can explicitly set a pointer to null to indicate that it doesn't point to anything.

double *pD;
pD = null;

Note, however, that if you wish a local pointer variable to be set to null, you must do so explicitly. For performance reasons, C# does not initialize local pointer variables implicitly on your behalf. Also, when dealing with pointers, the usual rules about not being able to access a variable before it is initialized are relaxed in one respect: the address-of operator may be applied to a variable that has not yet been initialized.

The sizeof Operator

When dealing directly with memory addresses, it can be useful to know how much memory each instance of a data type occupies. This information is provided by the sizeof operator, which takes one parameter, the name of a value data type (it does not handle reference types), and returns the number of bytes in memory occupied by that data type.

int X = sizeof(long); // returns 8
int Y = sizeof(float); // returns 4

Like pointers, the sizeof operator is only available inside unsafe methods or classes.

Pointer Arithmetic

It is possible to increment or decrement pointers, or to add integers to them. However, when performing pointer arithmetic, the compiler assumes that the basic unit of memory is the size of whatever data type a given pointer points to. Hence, for example, syntactically adding 1 to a long* pointer actually results in sizeof(long) being added to the pointer. Similarly, if you syntactically subtract 3 from a double* , then what will actually happen is that 3*sizeof(double) will be subtracted from the pointer.

double SomeDouble = 20.0;
double *pD = &SomeDouble;
pD -= 3; // subtracts 3*sizeof(double) = 24 from pD contents

If you are using this technique to navigate to the memory locations of variables, you should be aware that variables are not always stored contiguously. Modern 32-bit machines are designed to access memory in 32-bit (4-byte) blocks. Hence, for performance reasons, the .NET runtime always tries to store each variable beginning at a 4-byte boundary. This means that gaps will exist in memory between any variables whose size is not a multiple of 4.
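The scaling rule for pointer arithmetic can be confirmed with a small sketch. PointerStepDemo and its two methods are invented names, and the code needs the /unsafe compiler option. Adding 1 to a double* moves the address on by sizeof(double) bytes, which can be seen by comparing the two addresses as byte pointers. The second pointer is never dereferenced, since it does not point at a variable we own.

```csharp
using System;

public class PointerStepDemo
{
    // Number of bytes between p and p + 1 for a double* pointer.
    public static unsafe long ByteStep()
    {
        double value = 20.0;
        double* p = &value;
        double* q = p + 1;                    // one element further on
        return (byte*)q - (byte*)p;           // distance measured in bytes
    }

    // Number of elements between p and p + 1.
    public static unsafe long ElementStep()
    {
        double value = 20.0;
        double* p = &value;
        double* q = p + 1;
        return q - p;                         // distance measured in doubles
    }

    public static void Main()
    {
        Console.WriteLine(ByteStep());        // 8
        Console.WriteLine(ElementStep());     // 1
    }
}
```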

Stack-Based Arrays

It is possible in an unsafe code block to create high-performance stack-based arrays, using the stackalloc keyword. These stack-based arrays each consist simply of a block of memory on the stack, big enough to hold a specified number of instances of a given data type. These instances are accessed using pointer arithmetic, and there is therefore none of the overhead of having a full System.Array object hanging around. The stackalloc operator takes two parameters, the name of the data type and the number of elements needed. It has an unusual syntax that looks like this.

double *pArray = stackalloc double [20];

The return value of stackalloc is a pointer to the start of the memory allocated. The above code will allocate 20 times sizeof(double) bytes of memory.

Accessing the elements of an array allocated with stackalloc may be done using pointer arithmetic. But C# provides an easier syntax using square brackets, which allows elements of stack-based arrays to be treated syntactically in an identical manner to elements of full arrays.

double *pArray = stackalloc double [20];
pArray[0] = 30.0; // initializes first element of array
pArray[1] = 16.0; // initializes second element of array

The rule is as follows: when the C# compiler encounters an expression of the form p[X], where p is a pointer and X is an integer type, it always expands this to the expression *(p+X) - which correctly accesses the appropriate element of an array allocated with stackalloc. This special syntax applies to all pointers; they don't have to have been allocated with stackalloc.

Note that stackalloc does not initialize the contents of its memory. Also, no bounds checking is performed when you attempt to access its elements.
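The following sketch brings these points together. StackArrayDemo and Sum are invented names, and the code needs the /unsafe compiler option. The array is filled explicitly before use, because its initial contents are not defined, and it is then read back by walking a pointer through it.

```csharp
using System;

public class StackArrayDemo
{
    // Fills a stack-based array with 1, 2, ..., count and sums it.
    public static unsafe double Sum(int count)
    {
        double* pArray = stackalloc double[count];

        // stackalloc does not initialize the memory, so fill it first
        for (int i = 0; i < count; i++)
            pArray[i] = i + 1;              // same as *(pArray + i) = i + 1

        double total = 0;
        double* p = pArray;
        double* pEnd = pArray + count;      // one past the last element
        while (p < pEnd)                    // we do our own bounds checking
        {
            total += *p;
            p++;                            // moves on by sizeof(double) bytes
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Sum(4));          // 1 + 2 + 3 + 4 = 10
    }
}
```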

Thread Safety

For the most part, support for threads and thread safety is provided by the base classes in the System.Threading namespace, and is therefore beyond the scope of this appendix. However, C# does provide one statement, lock, which assists in thread safety by wrapping a mutual exclusion lock (also known as a mutex) around a given reference object for the duration of a statement or block statement. The mutual exclusion lock prevents any other thread in the process from entering a lock statement on the same object until the lock is released. It does not by itself stop code that never takes the lock from accessing the object in question.

// X is a reference type
lock (X)
{
    // code that executes with mutual exclusion lock on X
}

Note that in this code, X must be a reference type, and it is the object referred to by X, not the variable X, which is subject to the lock. Internally, lock works by providing a C# syntax wrapper around certain methods on the System.Threading.Monitor class. Further details of thread support are provided in Chapter 7, and in the MSDN documentation for the System.Threading namespace.
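As a sketch of the lock statement in use, the following sample has two threads incrementing a shared counter. Counter, LockDemo, and the private field sync are invented names. Because every access to the count takes the lock on the same object, no increments are lost.

```csharp
using System;
using System.Threading;

public class Counter
{
    private object sync = new object();    // object used only for locking
    private int count = 0;

    public void Increment()
    {
        lock (sync)
        {
            count++;                       // one thread at a time
        }
    }

    public int Count
    {
        get { lock (sync) { return count; } }
    }
}

public class LockDemo
{
    static Counter counter = new Counter();

    static void Work()
    {
        for (int i = 0; i < 100000; i++)
            counter.Increment();
    }

    // Runs two threads to completion and returns the final count.
    public static int Run()
    {
        counter = new Counter();
        Thread t1 = new Thread(new ThreadStart(Work));
        Thread t2 = new Thread(new ThreadStart(Work));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return counter.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Run());          // 200000
    }
}
```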

Keywords

In this section we present a list of the recognized C# keywords, with their approximate meanings.

Keyword | Meaning
abstract | Marks that a class cannot be instantiated but simply serves as a base class for other classes
base | Used to reference the base class of a class
bool | The struct, System.Boolean
break | Breaks out of the controlling for, foreach, do, while or switch statement
byte | The struct, System.Byte
case | Indicates the start of commands for a particular value of the control variable in a switch statement
catch | Marks the start of a block that throw transfers execution to
char | The struct, System.Char
checked | Indicates the following block of code should run in a checked context: overflow errors etc. should cause exceptions to be raised
class | Defines a new reference type
const | Indicates that a variable or field cannot be assigned to
continue | Transfers control to the next iteration of the controlling for, foreach, do or while statement
decimal | The struct, System.Decimal
default | Marks code that should be executed if none of the tests in a switch statement were matched
delegate | Defines a new type derived from System.Delegate
do | Start of a loop that executes at least once and then continually until a specified condition is false
double | The struct, System.Double
else | Marks the code block that should be executed if none of the conditions in an if .. else if block were true
enum | Defines a new type derived from System.Enum
event | Special delegate designed to work with Windows event architecture
explicit | Indicates that a cast may only be used explicitly
extern | Indicates that a method is implemented as a function by an external DLL
false | The Boolean value indicating false
finally | Marks block of code that should always be executed on leaving a try block
fixed | Prevents the garbage collector from moving a class instance
float | The struct, System.Single
for | A sophisticated loop that allows the developer to determine the action to be taken at the beginning of each iteration and the test of whether to exit the loop
foreach | A loop that enumerates over items in a collection
goto | Transfers control to the labeled statement
if | Indicates the following statement should only be executed if some condition is true
implicit | Indicates that a cast may be used either implicitly or explicitly
in | Helper keyword for the foreach loop
int | The struct, System.Int32
interface | Defines an interface
internal | Marks an item as only being visible to code within the same assembly
is | Tests whether an item is of a given data type
lock | Wraps a mutual exclusion lock (mutex) around an object on the heap for the duration of a statement or block statement
long | The struct, System.Int64
namespace | Defines a namespace
new | Calls the constructor of an object. Also indicates that a method hides a base class method
null | The null reference - indicates that a reference doesn't refer to anything
object | The class, System.Object
operator | Defines an operator overload
out | Marks a parameter as being one that the method being called will set the value of
override | Indicates that a method overrides a similarly named method in the base class
params | Indicates that a method takes a variable argument list
private | Prevents any code outside the class from being able to see a class member
protected | Prevents any code outside the class and derived classes from being able to see a class member
public | Makes a member visible to all other code
readonly | Prevents a field from being assigned to other than in the constructor of the containing data type
ref | Indicates that the value of a parameter may be modified by the method being called
return | Returns control to the next method up the call stack
sbyte | The struct, System.SByte
sealed | Prevents any classes from being derived from a class
short | The struct, System.Int16
sizeof | Returns the number of bytes occupied by a particular data type
stackalloc | For creating stack-based arrays of value types in unsafe code
static | Indicates that a member of a class or struct is not associated with any particular instance of that class or struct
string | The class, System.String
struct | Defines a value data type
switch | Presents a number of sets of statements, with control to be transferred to one of them depending on the value of some variable
this | The class or struct instance with which this method is associated
throw | Transfers control to the next catch statement that is able to handle a particular type of exception
true | The Boolean value indicating true
try | Indicates that exceptions may be generated by the following block of code and are handled by the subsequent catch block
typeof | Returns a System.Type object that gives detailed information about a named type
uint | The struct, System.UInt32
ulong | The struct, System.UInt64
unchecked | Cancels a checked statement. Returns to the default behaviour for overflow errors
unsafe | Marks a method or class as being one in which pointers and pointer operations may be used
ushort | The struct, System.UInt16
using | Indicates a namespace that will not be named explicitly, or alternatively supplies an alternative name for a particular type within some namespace. This can also be used to indicate that IDisposable.Dispose() should be called on a variable when it goes out of scope
virtual | Indicates that a method may be overridden
void | Specifies that a method does not return anything, or that we are not indicating the type that a pointer points to
while | Loop that tests some condition then executes until this condition is false
 

 ASPToday - © 2002 Wrox Press Limited