Thursday, March 12, 2009

Structure Lesson 1

#include <stdio.h>
#include <conio.h>
void main(void)
{
/* Declaring the structure */

struct person
{
char name[80];
int age;
int mobileNo;
};

/* Defining its object */

struct person prs;

clrscr();

printf("Enter your name: ");

gets(prs.name);

printf("Enter your age: ");

scanf("%d",&prs.age);

printf("Enter your mobileNo: ");

scanf("%d",&prs.mobileNo);

printf("Name = %s \n Age = %d \n MobileNo = %d",prs.name,prs.age,prs.mobileNo);

getch();
}

C Structure

#include <stdio.h>
#include <conio.h>
void main (void)
{
struct string
{
char name[80];
};
struct string str;
clrscr();
printf("Enter your name: ");
gets(str.name);
getch();
}

Wednesday, March 11, 2009

Pascal Programming Language

Pascal (programming language)
From Wikipedia, the free encyclopedia

Pascal is an influential imperative and procedural programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structuring.

A derivative known as Object Pascal was designed for object oriented programming.

Paradigm: imperative, structured
Appeared in: 1970, last revised 1992
Designed by: Niklaus Wirth
Typing discipline: static, strong, safe
Major implementations: CDC 6000, Pascal-P, PDP-11, PDP-10, IBM System/370, HP, GNU
Dialects: UCSD, Borland, Turbo
Influenced by: ALGOL
Influenced: Modula-2, Oberon, Oberon-2, Component Pascal, Ada, Object Pascal, Oxygene
Contents
1 History
2 Brief description
3 Implementations
4 Language constructs
4.1 Hello world
4.2 Data types
4.3 Data structures
4.3.1 Pointers
4.4 Control structures
4.5 Procedures and functions
5 Resources
5.1 Compilers and interpreters
6 Standards
6.1 Divisions
6.2 List of related standards
7 Reception
7.1 Criticism
7.1.1 Reactions
8 See also
9 Further reading
10 External links



History
Pascal is based on the ALGOL programming language and is named in honor of the French mathematician and philosopher Blaise Pascal. Wirth subsequently developed Modula-2 and Oberon, languages similar to Pascal. Before and leading up to Pascal, Wirth developed the language Euler, followed by Algol-W.

Initially, Pascal was largely, but not exclusively, intended to teach students structured programming. Generations of students have "cut their teeth" on Pascal as an introductory language in undergraduate courses. Variants of Pascal have also frequently been used for everything from research projects to PC games and embedded systems. Newer Pascal compilers exist which are widely used.

Pascal was the primary high-level language used for development in the Apple Lisa, and in the early years of the Mac; parts of the original Macintosh operating system were hand-translated into Motorola 68000 assembly language from the Pascal sources. The popular typesetting system TeX by Donald E. Knuth was written in WEB, the original literate programming system, based on DEC PDP-10 Pascal, while an application like Total Commander was written in Delphi (i.e. Object Pascal).


Brief description
Wirth's intention was to create an efficient language (regarding both compilation speed and generated code) based on so-called structured programming, a concept which had recently become popular. Pascal has its roots in the Algol 60 language, but also introduced concepts and mechanisms which (on top of Algol's scalars and arrays) enabled programmers to define their own complex (structured) datatypes, and also made it easier to build dynamic and recursive data structures such as lists, trees and graphs. Important features included for this were records, enumerations, subranges, dynamically allocated variables with associated pointers, and sets. To make this possible and meaningful, Pascal has a strong typing on all objects, which means that one type of data cannot be converted or interpreted as another without explicit conversions. Similar mechanisms are standard in many programming languages today. Other languages that influenced Pascal's development were COBOL, Simula 67, and Wirth's own Algol-W .

Pascal, like many scripting languages of today (but unlike most languages in the C family), allows nested procedure definitions to any level of depth, and also allows most kinds of definitions and declarations inside procedures and functions. This enables a very simple and coherent syntax where a complete program is syntactically nearly identical to a single procedure or function (except for the keyword itself, of course).


Implementations
The first Pascal compiler was designed in Zurich for the CDC 6000 series mainframe computer family. Niklaus Wirth reports that a first attempt to implement it in Fortran in 1969 was unsuccessful due to Fortran's inadequacy to express complex data structures. The second attempt was formulated in the Pascal language itself and was operational by mid-1970. Many Pascal compilers since have been similarly self-hosting, that is, the compiler is itself written in Pascal, and the compiler is usually capable of recompiling itself when new features are added to the language, or when the compiler is to be ported to a new environment. The GNU Pascal compiler is one notable exception, being written in C.

The first successful port of the CDC Pascal compiler to another mainframe was completed by Welsh and Quinn at the QUB in 1972. The target was the ICL 1900 series. This compiler in turn was the parent of the Pascal compiler for the ICS Multum minicomputer. The Multum port was developed – with a view to using Pascal as a systems programming language – by Findlay, Cupples, Cavouras and Davis, working at the Department of Computing Science in Glasgow University. It is thought that Multum Pascal, which was completed in the summer of 1973, may have been the first 16-bit implementation.

A completely new compiler was completed by Welsh et al. at QUB in 1977. It offered a source-language diagnostic feature (incorporating profiling, tracing and type-aware formatted postmortem dumps) that was implemented by Findlay and Watt at Glasgow University. This implementation was ported in 1980 to the ICL 2900 series by a team based at Southampton University and Glasgow University. The Standard Pascal Model Implementation was also based on this compiler, having been adapted, by Welsh and Hay at Manchester University in 1984, to check rigorously for conformity to the BSI 6192/ISO 7185 Standard and to generate code for a portable abstract machine.

The first Pascal compiler written in North America was constructed at the University of Illinois under Donald B. Gillies for the PDP-11 and generated native machine code. Pascal enjoyed great popularity throughout the 1970s and the 1980s.

In order to rapidly propagate the language, a compiler "porting kit" was created in Zurich that included a compiler that generated code for a "virtual" stack machine (i.e. code that lends itself to reasonably efficient interpretation), along with an interpreter for that code - the Pascal-P system. Although the SC (Stack Computer) code was primarily intended to be compiled into true machine code, at least one system, the notable UCSD implementation, utilized it to create the interpretive UCSD p-System. The P-system compilers were termed P1-P4, with P1 being the first version, and P4 being the last to come from Zurich.

The P4 compiler/interpreter can still be run and compiled on systems compatible with original Pascal. However, it only itself accepts a subset of the Pascal language. A version of P4 that accepts the full Pascal language and includes ISO 7185 compatibility was created and termed the P5 compiler, which is available in source form.

A version of the P4 compiler, which created native binaries, was released for the IBM System/370 mainframe computer by the Australian Atomic Energy Commission; it was called the "AAEC Pascal Compiler" after the abbreviation of the name of the Commission. A version of P4 from 1975-6, including source and binaries for the compiler and the run-time library files for the PDP-10 mainframe, is also available for download.

In the early 1980s, Watcom Pascal was developed, also for the IBM System 370.

IP Pascal was an implementation of the Pascal programming language using Micropolis DOS, but was moved rapidly to CP/M running on the Z80. It was moved to the 80386 machine types in 1994, and exists today as Windows/XP and Linux implementations. In 2008, the system was brought up to a new level and the resulting language termed "Pascaline" (after Pascal's calculator). It includes objects, namespace controls, dynamic arrays, along with many other extensions, and generally features the same functionality and type protection as C#. It is the only such implementation which is also compatible with the original Pascal implementation (which is standardized as ISO 7185).

In the early 1980s, UCSD Pascal was ported to the Apple II and Apple III computers to provide a structured alternative to the BASIC interpreters that came with the machines.

Apple Computer created its own Lisa Pascal for the Lisa Workshop in 1982 and ported this compiler to the Apple Macintosh and MPW in 1985. In 1985 Larry Tesler, in consultation with Niklaus Wirth, defined Object Pascal and these extensions were incorporated in both the Lisa Pascal and Mac Pascal compilers.

In the 1980s Anders Hejlsberg wrote the Blue Label Pascal compiler for the Nascom-2. A reimplementation of this compiler for the IBM PC was marketed under the names Compas Pascal and PolyPascal before it was acquired by Borland. Renamed to Turbo Pascal it became hugely popular, thanks in part to an aggressive pricing strategy and in part to having one of the first full-screen Integrated development environments. Additionally, it was written and highly optimized entirely in assembly language, making it smaller and faster than much of the competition. In 1986 Anders ported Turbo Pascal to the Macintosh and incorporated Apple's Object Pascal extensions into Turbo Pascal. These extensions were then added back into the PC version of Turbo Pascal for version 5.5.

The inexpensive Borland compiler had a large influence on the Pascal community that began concentrating mainly on the IBM PC in the late 1980s. Many PC hobbyists in search of a structured replacement for BASIC used this product. It also began adoption by professional developers. Around the same time a number of concepts were imported from C in order to let Pascal programmers use the C-based API of Microsoft Windows directly. These extensions included null-terminated strings, pointer arithmetic, function pointers, an address-of operator and unsafe typecasts.

However, Borland later decided it wanted more elaborate object-oriented features, and started over in Delphi using the Object Pascal draft standard proposed by Apple as a basis. (This Apple draft is still not a formal standard.) The first versions of the Delphi Programming Language were accordingly named Object Pascal. The main additions compared to the older OOP extensions were a reference-based object model, virtual constructors and destructors, and properties. Several other compilers also implement this dialect.

Turbo Pascal and other derivatives with unit or module concepts are modular languages. However, they do not provide a nested module concept or qualified import and export of specific symbols.

Super Pascal was a variant which added non-numeric labels, a return statement and expressions as names of types.

The universities of Zurich, Karlsruhe and Wuppertal have developed an EXtension for Scientific Computing (Pascal XSC), which provides a free solution for programming numerical computations with controlled precision.

In 2005, at the Web 2.0 conference, Morfik Technology introduced a tool which allowed the development of Web applications entirely written in Morfik Pascal. Morfik Pascal is a dialect of Object Pascal, very close to Delphi.


Language constructs
Pascal, in its original form, is a purely procedural language and includes the traditional array of Algol-like control structures with reserved words such as if, then, else, while, for, and so on. However, Pascal also has many data structuring facilities and other abstractions which were not included in the original Algol60, like type definitions, records, pointers, enumerations, and sets. Such constructs were in part inherited or inspired from Simula67, Algol68, Niklaus Wirth's own AlgolW and suggestions by C. A. R. Hoare.


Hello world
Pascal programs start with the program keyword with a list of external file descriptors as parameters; then follows the main statement block encapsulated by the begin and end keywords. Semicolons separate statements, and the full stop ends the whole program (or unit). Letter case is ignored in Pascal source. Some compilers, Turbo Pascal among them, have made the program keyword optional.

Here is an example of the source code in use for a very simple "Hello world" program:

Program HelloWorld(output);
begin
writeLn('Hello, World!')
end.

Data types
A type in Pascal, and in several other popular programming languages, defines a variable in such a way that it defines a range of values which the variable is capable of storing, and it also defines a set of operations that are permissible to be performed on variables of that type. The standard types, with a very brief description of each, follow:

Data type   Range of values the variable can store
integer     Whole numbers from -32768 to 32767
byte        Integers from 0 to 255
real        Floating-point numbers from 1E-38 to 1E+38
boolean     Only the values TRUE or FALSE
char        Any character in the ASCII character set


Data structures
Pascal's simple (atomic) types are real, integer, character, boolean and enumerations, a new type constructor introduced with Pascal:

var
r: Real;
i: Integer;
c: Char;
b: Boolean;
e: (apple, pear, banana, orange, lemon);
Subranges of any ordinal type (any simple type except real) can be made:

var
x: 1..10;
y: 'a'..'z';
z: pear..orange;
In contrast with other programming languages from its time, Pascal supports a set type:

var
set1: set of 1..10;
set2: set of 'a'..'z';
set3: set of pear..orange;
A set is a fundamental concept in modern mathematics, and sets may be used in a great many algorithms. Such a feature is highly useful and may be faster than an equivalent construct in a language that does not support sets. For example, for many Pascal compilers:

if i in [5..10] then
...
is faster than

if (i>4) and (i<11) then
...
Types can be defined from other types using type declarations:

type
x = Integer;
y = x;
...
Further, complex types can be constructed from simple types:

type
a = Array [1..10] of Integer;
b = record
x: Integer;
y: Char
end;
c = File of a;
As shown in the example above, Pascal files are sequences of components. Every file has a buffer variable which is denoted by f^. The procedures get (for reading) and put (for writing) move the buffer variable to the next element. Read is introduced such that read(f, x) is the same as x := f^; get(f);. Write is introduced such that write(f, x) is the same as f^ := x; put(f);. The type text is predefined as file of char. While the buffer variable could be used to inspect the next character that would be read (for example, to check for a digit before reading an integer), this concept led to serious problems with interactive programs in early implementations, but was solved later with the "lazy I/O" concept.

In Jensen & Wirth Pascal, strings are represented as packed arrays of chars; they therefore have fixed length and are usually space-padded. Some dialects have a custom string type.


Pointers
Pascal supports the use of pointers:

type
a = ^b;
b = record
a: Integer;
b: Char;
c: a
end;
var
pointer_to_b: a;
Here the variable pointer_to_b is a pointer to the data type b, a record. Pointers can be used before they are declared. This is a forward declaration, an exception to the rule that things must be declared before they are used. To create a new record and assign the value 10 and character A to the fields a and b in the record, and to initialise the pointer c to nil, the commands would be:

new(pointer_to_b);
pointer_to_b^.a := 10;
pointer_to_b^.b := 'A';
pointer_to_b^.c := nil;
...
This could also be done using the with statement, as follows

new(pointer_to_b);

with pointer_to_b^ do
begin
a := 10;
b := 'A';
c := nil
end;
...
Note that inside the scope of the with statement, the compiler knows that a and b refer to the fields of the record pointed to by pointer_to_b, and not to the record type b or the pointer type a.

Linked lists, stacks and queues can be created by including a pointer type field (c) in the record (see also nil and null (computer programming)).


Control structures
Pascal is a structured programming language, meaning that the flow of control is structured into standard statements, ideally without 'go to' commands.

while a <> b do writeln('Waiting');

if a > b then writeln('Condition met')
else writeln('Condition not met');

for i := 1 to 10 do writeln('Iteration: ', i:1);

repeat
a := a + 1
until a = 10;

case i of
0: write('zero');
1: write('one');
2: write('two')
end;

Procedures and functions
Pascal structures programs into procedures and functions.

program mine(output);

var i : integer;

procedure print(var j: integer);

function next(k: integer): integer;
begin
next := k + 1
end;

begin
writeln('The total is: ', j);
j := next(j)
end;

begin
i := 1;
while i <= 10 do print(i)
end.
Procedures and functions can nest to any depth, and the 'program' construct is the logical outermost block.

Each procedure or function can have its own declarations of goto labels, constants, types, variables, and other procedures and functions, which must all be in that order. This ordering requirement was originally intended to allow efficient single-pass compilation. However, in some dialects the strict ordering requirement of declaration sections is not required.


Resources

Compilers and interpreters
Several Pascal compilers and interpreters are available for general use:

Delphi is CodeGear's (formerly Borland) flagship RAD (Rapid Application Development) product. It uses the Object Pascal language (dubbed the 'Delphi programming language' by Borland), descended from Pascal, to create applications for the Windows platform. The .NET support that existed from D8 through D2005, D2006 and D2007 has been terminated, and replaced by a new language (Prism, which is rebranded Oxygene, see below) that is not fully backwards compatible. The most recent iteration of the Win32 range (D2009) adds Unicode and generics support. A version of Delphi (D2006), Turbo Delphi Explorer, is available for free download.
Free Pascal (www.freepascal.org) is a multi-platform compiler written in Pascal (it is Self-hosting). It is aimed at providing a convenient and powerful compiler, both able to compile legacy applications and to be the means of developing new ones. It is distributed under the GNU GPL. Apart from compatibility modes for Turbo Pascal, Delphi and Mac Pascal, it also has its own procedural and object oriented syntax modes with support for extended features such as operator overloading. It supports many platforms and operating systems.
Lazarus (lazarus.freepascal.org) is a Delphi-like visual cross-platform IDE for RAD (Rapid Application Development). Based on FreePascal, Lazarus is available for numerous platforms including Linux, FreeBSD, Mac OS X and Microsoft Windows.
Dev-Pascal is a Pascal IDE that was developed in Borland Delphi and supports both Free Pascal and GNU Pascal as back ends. Unlike its C++ sibling, it has not seen a significant release in years.
Oxygene (formerly known as Chrome) is a next-generation Object Pascal compiler for the .NET and Mono platforms. It was created and is sold by RemObjects Software, and more recently by CodeGear/Embarcadero as Prism. It tries to carry the spirit of Pascal to .NET, but is not very compatible with other Pascals.
Kylix was a descendant of Delphi, with support for the Linux operating system and an improved object library. The compiler and the IDE are available now for non-commercial use. The product is no longer supported by Borland.
GNU Pascal Compiler (GPC) is the Pascal compiler of the GNU Compiler Collection (GCC). The compiler itself is written in C, the runtime library mostly in Pascal. Distributed freely under the GNU General Public License, it runs on many platforms and operating systems. It supports the ANSI/ISO standard languages and has partial support for the Borland/Turbo Pascal dialect. One of the more painful omissions is the absence of a 100% TP-compatible string type. Support for Borland Delphi and other language variations is quite limited, except perhaps for Mac Pascal, the support for which is growing fast.
Virtual Pascal was created by Vitaly Miryanov in 1995 as a native OS/2 compiler compatible with Borland Pascal syntax. It was then developed commercially by fPrint, adding Win32 support, and in 2000 it became freeware. Today it can compile for Win32, OS/2 and Linux, and is mostly compatible with Borland Pascal and Delphi. Development of this compiler was canceled on April 4, 2005.
P4 compiler, the basis for many subsequent Pascal-implemented-in-Pascal compilers, including the UCSD p-System. It implements a subset of full Pascal.
P5 compiler is an ISO 7185 (full Pascal) adaptation of P4.
Turbo Pascal was the dominant Pascal compiler for PCs during the 80s and early 90s, popular both because of its powerful extensions and extremely short compilation times. Turbo Pascal was compactly written and could compile, run, and debug all from memory without accessing disk. Slow floppy disk drives were common for programmers at the time, further magnifying Turbo Pascal's speed advantage. Currently, older versions of Turbo Pascal (up to 5.5) are available for free download from Borland's site.
Turbo51 (turbo51.com) is a free Pascal compiler for the 8051 family of microcontrollers (uses Turbo Pascal 7 syntax)
Dr. Pascal is an interpreter that runs Standard Pascal. Notable are the "visible execution" mode that shows a running program and its variables, and the extensive runtime error checking. Runs programs but does not produce a separate executable binary. Runs on MS-DOS, Windows in DOS window, and old Macintosh.
Dr. Pascal's Extended Pascal Compiler tested on DOS, Windows 3.1, 95, 98, NT.
IP Pascal Implements the language "Pascaline" (named after Pascal's calculator), which is a highly extended Pascal compatible with original Pascal according to ISO 7185. It features modules with namespace control, including parallel tasking modules with semaphores, objects, dynamic arrays of any dimensions that are allocated at runtime, overloads, overrides, and many other extensions. IP Pascal has a built-in portability library that is custom tailored to the Pascal language. For example, a standard text output application from 1970's original Pascal can be recompiled to work in a window and even have graphical constructs added.
PocketStudio is a Pascal subset compiler/RAD targeting Palm / MC68xxx with some own extensions to assist interfacing with the Palm OS API.
MIDletPascal - A Pascal compiler and IDE that generates small and fast Java bytecode specifically designed to create software for mobiles
Vector Pascal is a language targeted at SIMD instruction sets such as MMX and AMD 3DNow!, supporting all Intel and AMD processors, as well as the Sony PlayStation 2 Emotion Engine.
Morfik Pascal allows the development of Web applications entirely written in Object Pascal (both server and browser side).
web Pascal (www.codeide.com) is an online IDE and Pascal compiler.
WDSibyl - Visual Development Environment and Pascal compiler for Win32 and OS/2
PP Compiler, a compiler for Palm OS that runs directly on the handheld computer
CDC 6000 Pascal compiler - the source code for the first (CDC 6000) Pascal compiler.
Pascal-S - "Pascal-S: A Subset and Its Implementation", N. Wirth in Pascal - The Language and Its Implementation, by D.W. Barron, Wiley 1979.
A very extensive list can be found on Pascaland. The site is in French, but it is basically a list with URLs to compilers; there is little barrier for non-Francophones. The site, Pascal Central, a Mac centric Pascal info and advocacy site with a rich collection of article archives, plus links to many compilers and tutorials, may also be of interest.


Standards
In 1983, the language was standardized in the international standard ISO/IEC 7185, as well as in several local, country-specific standards, including the American ANSI/IEEE 770X3.97-1983 and ISO 7185:1983. These two standards differed only in that the ISO standard included a "level 1" extension for conformant arrays, whereas ANSI did not allow this extension to the original (Wirth version) language. In 1989, ISO 7185 was revised (ISO 7185:1990) to correct various errors and ambiguities found in the original document.

In 1990, an extended Pascal standard was created as ISO/IEC 10206. In 1993 the ANSI standard was replaced by the ANSI organization with a "pointer" to the ISO 7185:1990 standard, effectively ending its status as a different standard.

The ISO 7185 was stated to be a clarification of Wirth's 1974 language as detailed by the User Manual and Report [Jensen and Wirth], but was also notable for adding "Conformant Array Parameters" as a level 1 to the standard, level 0 being Pascal without Conformant Arrays.

Note that Niklaus Wirth himself referred to the 1974 language as "the Standard", for example, to differentiate it from the machine specific features of the CDC 6000 compiler. This language was documented in "The Pascal Report", the second part of the "Pascal users manual and report".

On the large machines (mainframes and minicomputers) Pascal originated on, the standards were generally followed. On the IBM-PC, they were not. On IBM-PCs, the Borland standards Turbo Pascal and Delphi have the greatest number of users. Thus, it is typically important to understand whether a particular implementation corresponds to the original Pascal language, or a Borland dialect of it.

The IBM-PC versions of the language began to differ with the advent of UCSD Pascal, an interpreted implementation that featured several extensions to the language, along with several omissions and changes. Many UCSD language features survive today, including in Borland's dialect.


Divisions
Niklaus Wirth's Zurich version of Pascal was issued outside of ETH in two basic forms, the CDC 6000 compiler source, and a porting kit called Pascal-P system. The Pascal-P compiler left out several features of the full language. For example, procedures and functions used as parameters, undiscriminated variant records, packing, dispose, interprocedural gotos and other features of the full compiler were omitted.

UCSD Pascal, under Professor Kenneth Bowles, was based on the Pascal-P2 kit, and consequently shared several of the Pascal-P language restrictions. UCSD Pascal was later adopted as Apple Pascal, and continued through several versions there. Although UCSD Pascal actually expanded the subset Pascal in the Pascal-P kit by adding back standard Pascal constructs, it was still not a complete standard installation of Pascal.

Borland's Turbo Pascal, written by Anders Hejlsberg, was implemented in assembly language independently of UCSD and the Zurich compilers. However, it adopted much of the same subset and extensions as the UCSD compiler, probably because the UCSD system was the most common Pascal system suitable for developing applications on the resource-limited microprocessor systems available at that time.


List of related standards
ISO 8651-2:1988 Information processing systems -- Computer graphics -- Graphical Kernel System (GKS) language bindings -- Part 2: Pascal

Reception
Pascal generated a wide variety of responses in the computing community, both critical and complimentary.


Criticism
While very popular (although more so in the 1980s and early 1990s than now), early versions of Pascal have been widely criticized for being unsuitable for "serious" use outside of teaching. Brian Kernighan, who popularized the C programming language, outlined his most notable criticisms of Pascal as early as 1981, in his paper Why Pascal Is Not My Favorite Programming Language. On the other hand, many major development efforts in the 1980s, such as for the Apple Lisa and Macintosh, heavily depended on Pascal (to the point where the C interface for the Macintosh operating system API had to deal in Pascal data types).


Reactions
In the decades since then, Pascal has in fact continued to evolve, and most of Kernighan's points do not apply to current implementations. Unfortunately, just as Kernighan predicted in his article, most of the extensions that fix these issues are incompatible from compiler to compiler. In the last decade, however, the varieties seem to have condensed into two categories, ISO and Borland-like, a better eventual outcome than Kernighan foresaw.

Although Kernighan decried Pascal's lack of type escapes ("there is no escape", from Why Pascal Is Not My Favorite Programming Language), the uncontrolled use of pointers and type escapes have themselves become highly criticized features, and languages such as Java and C# feature a sharp turnaround to the Pascal point of view. What these languages call "managed pointers" were in fact foreseen by Wirth with the creation of Pascal.

Based on his experience with Pascal (and earlier with ALGOL) Niklaus Wirth developed several more programming languages: Modula, Modula-2 and Oberon. These languages address some criticisms of Pascal, are intended for different user populations, and so on, but none has had the widespread impact on computer science and computer users as has Pascal, nor has any yet met with similar commercial success.


See also
Wikibooks has a book on the topic of Pascal.
Alphabetical list of programming languages
ALGOL
Ada programming language
Delphi programming language
Comparison of Pascal and Borland Delphi
Modula programming language
Modula 2
Oberon programming language
Object Pascal
IP Pascal
Oxygene
Concurrent Pascal
Comparison of Pascal and C
C (programming language)
Comparison of Pascal IDEs
Real Programmers Don't Use Pascal

Further reading
Niklaus Wirth: The Programming Language Pascal. 35-63, Acta Informatica, Volume 1, 1971.
C A R Hoare: Notes on data structuring. In O-J Dahl, E W Dijkstra and C A R Hoare, editors, Structured Programming, pages 83–174. Academic Press, 1972.
C. A. R. Hoare, Niklaus Wirth: An Axiomatic Definition of the Programming Language Pascal. 335-355, Acta Informatica, Volume 2, 1973.
Kathleen Jensen and Niklaus Wirth: PASCAL - User Manual and Report. Springer-Verlag, 1974, 1985, 1991, ISBN 0-387-97649-3 and ISBN 3-540-97649-3
Niklaus Wirth: Algorithms + Data Structures = Programs. Prentice-Hall, 1975, ISBN 0-13-022418-9
Niklaus Wirth: An assessment of the programming language PASCAL 23-30 ACM SIGPLAN Notices Volume 10, Issue 6, June 1975.
N. Wirth, and A. I. Wasserman, ed: Programming Language Design. IEEE Computer Society Press, 1980
D. W. Barron (Ed.): Pascal - The Language and its Implementation. John Wiley 1981, ISBN 0-471-27835-1
Peter Grogono: Programming in Pascal, Revised Edition, Addison-Wesley, 1980
Richard S. Forsyth: Pascal in Work and Play, Chapman and Hall, 1982
N. Wirth, M. Broy, ed, and E. Denert, ed: Pascal and its Successors in Software Pioneers: Contributions to Software Engineering. Springer-Verlag, 2002, ISBN 3-540-43081-4
N. Wirth: Recollections about the Development of Pascal. ACM SIGPLAN Notices, Volume 28, No 3, March 1993.

External links
Pascal Language Tutorial
The Pascal Programming Language
Standard Pascal – Resources and history of original, standard Pascal

QBasic Programming Language

QBasic
From Wikipedia, the free encyclopedia
Appeared in: 1991 - 2000
Developer: Microsoft Corporation
Influenced by: QuickBASIC, GW-BASIC
OS: MS-DOS, Windows 95, Windows 98, Windows Me, PC-DOS, OS/2, eComStation
License: MS-EULA
Website: www.microsoft.com
QBasic is an IDE and interpreter for a variant of the BASIC programming language which is based on QuickBASIC. Code entered into the IDE is compiled to an intermediate form, and this intermediate form is immediately interpreted on demand within the IDE.[1] It can run under nearly all versions of DOS and Windows, or through DOSBox/DOSEMU on Linux and FreeBSD.[2] For its time, QBasic provided a state-of-the-art IDE, including a debugger with features such as on-the-fly expression evaluation and code modification that were still relatively unusual more than ten years later.

Like QuickBASIC, but unlike earlier versions of Microsoft BASIC, QBasic is a structured programming language, supporting constructs such as subroutines and while loops.[3][4] Line numbers, a concept often associated with BASIC, are supported for compatibility, but are not considered good form, having been replaced by descriptive line labels.[1] QBasic has limited support for user-defined data types (structures), and several primitive types used to contain strings of text or numeric data.[5][6]

Contents
1 History
2 Examples
2.1 "Hello, World!"
2.2 Simple game
3 Easter egg
4 References
5 See also
6 External links



History
QBasic was intended as a replacement for GW-BASIC. It was based on the earlier QuickBASIC 4.5 compiler but without QuickBASIC's compiler and linker elements. Version 1.0 was shipped together with MS-DOS 5.0 and higher, as well as Windows 95, Windows NT 3.x, and Windows NT 4.0. IBM recompiled QBasic and included it in PC-DOS 5.x, as well as OS/2 2.0 onwards.[7] eComStation, descended from OS/2 code, includes QBasic 1.0. QBasic 1.1 is included with MS-DOS 6.x and, without EDIT, in Windows 95, Windows 98 and Windows Me. Starting with Windows 2000, Microsoft no longer includes QBasic with their operating systems,[8] although some localized versions of Windows 2000 and Windows XP still include it.

QBasic (as well as the built-in MS-DOS Editor) is backward compatible with DOS releases prior to 5.0 (down to at least DOS 3.20). However, if used on any 8088/8086 computers, or on some 80286 computers, the QBasic program may run very slowly, or perhaps not at all, due to its memory size. Until MS-DOS 7, MS-DOS Editor required QBasic. The "edit.com" program simply started QBasic in editor mode only.


Examples
QBasic came complete with four pre-written example programs: "Nibbles" (a variant of the Snake game); "Gorillas", an explosive-banana-throwing game derived from the Artillery game first produced on the Tektronix 4051 and later the HP 2647; "MONEY MANAGER", a personal finance manager; and "RemLine", a program that removes line numbers from GW-BASIC code.[1]


"Hello, World!"
PRINT "Hello, World!"

Simple game
This program challenges the user to guess a randomly selected number within the 1-10 range, without offering the usual hints of "higher"/"lower":

CLS
PRINT "Guess My Number"
INPUT "Would you like to play"; choice$ 'An input statement, that takes what the user inputs...
choice$ = UCASE$(choice$) ' makes the input completely uppercase (fkld ---> FKLD)
IF choice$ <> "YES" AND choice$ <> "Y" THEN ' and decides whether or not they want to play:
END
END IF
guesses% = 5 ' Set up the number of guesses remaining
RANDOMIZE TIMER ' Sets up the random number generator
target% = INT(RND * 10) + 1
WHILE guesses% > 0
INPUT "Guess a number: ", guess% ' Takes user input (the guess)
IF guess% = target% THEN ' Determines if the guess was correct
PRINT "You win!"
END
ELSE
guesses% = guesses% - 1
PRINT "Sorry, please try again. You have "; guesses%; " guesses left"
END IF
WEND

PRINT "You ran out of guesses, the number was "; target%
END

Easter egg
QBasic has a little-known easter egg. To see it, press and hold Left CTRL+Left SHIFT+Left ALT and Right CTRL+Right SHIFT+Right ALT simultaneously after running QBasic at the DOS prompt but before the title screen loads: this lists The Team of programmers.[7] Note that on modern computers the title screen loads far too quickly for the keys to be pressed in time. It is best done on an old PC (preferably one with a working Turbo button, with the switch set to slow the CPU to 4.77 MHz) or in an emulator such as Bochs or DOSBox which can be slowed down.


References
[1] "Differences Between GW-BASIC and QBasic". 2003-05-12. http://support.microsoft.com/kb/73084. Retrieved on 2008-06-28.
[2] "HOWTO Play With Your Old QBasic Programs on Linux". 2007-03-31. http://penguinpetes.com/b2evo/index.php?title=howto_play_with_your_old_qbasic_programs. Retrieved on 2008-06-28.
[3] "QBASIC Manual: SUB...END SUB Statement QuickSCREEN". http://www.qbasicnews.com/qboho/qcksub.shtml. Retrieved on 2008-06-28.
[4] "QBASIC Manual: WHILE...WEND Statement QuickSCREEN". http://www.qbasicnews.com/qboho/qckwend.shtml. Retrieved on 2008-06-28.
[5] "QBASIC Manual: TYPE Statement QuickSCREEN". http://www.qbasicnews.com/qboho/qcktype.shtml. Retrieved on 2008-06-28.
[6] "QBASIC Manual: Limits - Names, Strings, and Numbers". http://www.qbasicnews.com/qboho/qckadvr@l8207.shtml. Retrieved on 2008-06-28.
[7] "Microsoft BASIC version information". http://www.emsps.com/oldtools/msbasv.htm#qbasic. Retrieved on 2008-06-12.
[8] "QBasic Missing from Windows 2000". 2007-03-01. http://support.microsoft.com/kb/258265. Retrieved on 2008-06-12.

See also
Wikibooks has a book on the topic of QBasic.
FreeBASIC
True Basic
Visual Basic
PowerBASIC
PureBasic
QB64

External links
Download QBASIC 1.1 from Microsoft (included in the "Old MS-DOS Utilities" part of Windows 95 CD-ROM Extras)
QB Express - Online magazine about QBasic programming
The QBasic Station - Created in 1997 by Jack Thomson, it is one of the oldest, but still active, QBasic sites on the web.

C Sharp Language

C Sharp (programming language)
From Wikipedia, the free encyclopedia
Paradigm: structured, imperative, object-oriented
Appeared in: 2001
Designed by: Microsoft Corporation
Latest release: 3.0 / 19 November 2007
Typing discipline: static, strong, safe, nominative
Major implementations: .NET Framework, Mono, DotGNU
Influenced by: Object Pascal, C++, Modula-3, Java, Eiffel, C
Influenced: F#, Nemerle, D, Java[1], Vala, Windows PowerShell
C# (pronounced C Sharp) is a multi-paradigm programming language that encompasses functional, imperative, generic, object-oriented (class-based), and component-oriented programming disciplines. It was developed by Microsoft as part of the .NET initiative and later approved as a standard by ECMA (ECMA-334) and ISO (ISO/IEC 23270). C# is one of the programming languages supported by the .NET Framework's Common Language Runtime.

C# is intended to be a simple, modern, general-purpose, object-oriented programming language. Its development team is led by Anders Hejlsberg, the designer of Borland's Object Pascal language. It has an object-oriented syntax based on C++ and is heavily influenced by Java. It was initially named Cool, which stood for "C-like Object Oriented Language". However, in July 2000, when Microsoft made the project public, the name of the programming language was given as C#. The most recent version of the language is 3.0, which was released in conjunction with the .NET Framework 3.5 in 2007. The next proposed version, 4.0, is in development.

Contents
1 Design goals
2 History
3 Features
4 Common Type system (CTS)
4.1 Categories of datatypes
4.2 Boxing and unboxing
5 Features of C# 2.0
5.1 Partial class
5.2 Generics
5.3 Static classes
5.4 A new form of iterator providing generator functionality
5.5 Anonymous delegates
5.6 Delegate covariance and contravariance
5.7 The accessibility of property accessors can be set independently
5.8 Nullable types
5.9 Null-Coalesce operator
6 Features of C# 3.0
6.1 LINQ (Language-Integrated Query)
6.2 Object initializers
6.3 Collection initializers
6.4 Anonymous types
6.5 Local variable type inference
6.6 Lambda expressions
6.7 Automatic properties
6.8 Extension methods
6.9 Partial methods
7 Features of C# 4.0
7.1 Dynamic member lookup
7.2 Covariant and contravariant generic type parameters
7.3 Optional ref Keyword when using COM
7.4 Optional parameters and named arguments
7.5 Indexed properties
8 Preprocessor
9 Code comments
10 XML documentation system
11 Libraries
12 "Hello, world" example
13 Standardization
13.1 Criticism
14 Implementations
15 Language name
16 See also
16.1 Environments and tools
16.2 Related languages
16.3 Comparisons
17 Notes
18 References
19 External links



Design goals
The ECMA standard lists these design goals for C#:

C# is intended to be a simple, modern, general-purpose, object-oriented programming language.
Because software robustness, durability and programmer productivity are important, the language should include strong type checking, array bounds checking, detection of attempts to use uninitialized variables, source code portability, and automatic garbage collection.
The language is intended for use in developing software components that can take advantage of distributed environments.
Programmer portability is very important, especially for those programmers already familiar with C and C++.
Support for internationalization is very important.
C# is intended to be suitable for writing applications for both hosted and embedded systems, ranging from the very large that use sophisticated operating systems, down to the very small having dedicated functions.
Although C# applications are intended to be economical with regard to memory and processing power requirements, the language is not intended to compete directly on performance and size with C.

History
In 1996, Sun Microsystems released the Java programming language, with Microsoft soon purchasing a license to implement it in their operating system. Java was originally meant to be a platform-independent language, but Microsoft, in their implementation, broke their license agreement and made a few changes that would essentially inhibit Java's platform-independent capabilities. Sun filed a lawsuit and Microsoft settled, deciding to create their own version of a partially compiled, partially interpreted object-oriented programming language with syntax closely related to that of C++.

During the development of .NET Framework, the class libraries were originally written in a language/compiler called Simple Managed C (SMC).[2][3][4] In January 1999, Anders Hejlsberg formed a team to build a new language at the time called Cool, which stood for "C like Object Oriented Language".[5] Microsoft had considered keeping the name "Cool" as the final name of the language, but chose not to do so for trademark reasons. By the time the .NET project was publicly announced at the July 2000 Professional Developers Conference, the language had been renamed C#, and the class libraries and ASP.NET runtime had been ported to C#.

C#'s principal designer and lead architect at Microsoft is Anders Hejlsberg, who was previously involved with the design of Turbo Pascal, Borland Delphi, and Visual J++. In interviews and technical papers he has stated that flaws in most major programming languages (e.g. C++, Java, Delphi, and Smalltalk) drove the fundamentals of the Common Language Runtime (CLR), which, in turn, drove the design of the C# programming language itself. Some argue that C# shares roots with other languages.[6]


Features
Note: The following description is based on the language standard and other documents listed in the external links section.
By design, C# is the programming language that most directly reflects the underlying Common Language Infrastructure (CLI). Most of its intrinsic types correspond to value-types implemented by the CLI framework. However, the language specification does not state the code generation requirements of the compiler: that is, it does not state that a C# compiler must target a Common Language Runtime, or generate Common Intermediate Language (CIL), or generate any other specific format. Theoretically, a C# compiler could generate machine code like traditional compilers of C++ or FORTRAN. However, in practice, all existing compiler implementations target CIL.

Some notable C# distinguishing features are:

There are no global variables or functions. All methods and members must be declared within classes. It is possible, however, to use static methods/variables within public classes instead of global variables/functions.
Local variables cannot shadow variables of the enclosing block, unlike C and C++. Variable shadowing is often considered confusing by C++ texts.
C# supports a strict Boolean datatype, bool. Statements that take conditions, such as while and if, require an expression of a boolean type. While C++ also has a boolean type, it can be freely converted to and from integers, and expressions such as if(a) require only that a is convertible to bool, allowing a to be an int, or a pointer. C# disallows this "integer meaning true or false" approach on the grounds that forcing programmers to use expressions that return exactly bool can prevent certain types of programming mistakes such as if (a = b) (use of = instead of ==).
In C#, memory address pointers can only be used within blocks specifically marked as unsafe, and programs with unsafe code need appropriate permissions to run. Most object access is done through safe object references, which are always either pointing to a valid, existing object, or have the well-defined null value; a reference to a garbage-collected object, or to random block of memory, is impossible to obtain. An unsafe pointer can point to an instance of a value-type, array, string, or a block of memory allocated on a stack. Code that is not marked as unsafe can still store and manipulate pointers through the System.IntPtr type, but cannot dereference them.
Managed memory cannot be explicitly freed, but is automatically garbage collected. Garbage collection addresses memory leaks. C# also provides direct support for deterministic finalization with the using statement (supporting the Resource Acquisition Is Initialization idiom).
Multiple inheritance is not supported, although a class can implement any number of interfaces. This was a design decision by the language's lead architect to avoid complication, avoid dependency hell and simplify architectural requirements throughout CLI.
C# is more typesafe than C++. The only implicit conversions by default are those which are considered safe, such as widening of integers and conversion from a derived type to a base type. This is enforced at compile-time, during JIT, and, in some cases, at runtime. There are no implicit conversions between booleans and integers, nor between enumeration members and integers (except for literal 0, which can be implicitly converted to any enumerated type). Any user-defined conversion must be explicitly marked as explicit or implicit, unlike C++ copy constructors and conversion operators, which are both implicit by default.
Enumeration members are placed in their own scope.
C# provides syntactic sugar for a common pattern of a pair of methods, accessor (getter) and mutator (setter), encapsulating operations on a single attribute of a class, in the form of properties (a short sketch follows this list).
Full type reflection and discovery is available.
C# currently (as of 3 June 2008) has 77 reserved words.
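As a rough illustration of the property syntax mentioned in the list above, here is a minimal sketch; the Person class and its Name property are hypothetical examples, not taken from the standard:

// A hypothetical Person class: the Name property wraps a private field.
public class Person
{
private string name; // private backing field
public string Name // the property bundles the accessor and mutator
{
get { return name; }
set { name = value; }
}
}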

Common Type system (CTS)
C# has a unified type system. This unified type system is called Common Type System (CTS).[7]

A unified type system implies that all types, including primitives such as integers, are subclasses of the System.Object class. For example, every type inherits a ToString() method. For performance reasons, primitive types (and value types in general) are internally allocated on the stack.
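As a small illustration (a fragment assumed to sit inside a method body, with using System in scope), even a primitive int exposes the members it inherits from System.Object:

int n = 42; // a primitive value type
string s = n.ToString(); // ToString() is inherited from System.Object
object o = n; // an int can be treated as System.Object (the value is boxed)
Console.WriteLine(s + " / " + o); // prints "42 / 42"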


Categories of datatypes
CTS separates datatypes into two categories[7]:

Value types
Reference types
Value types are plain aggregations of data. Instances of value types do not have referential identity nor a referential comparison semantics - equality and inequality comparisons for value types compare the actual data values within the instances, unless the corresponding operators are overloaded. Value types are derived from System.ValueType, always have a default value, and can always be created and copied. Some other limitations on value types are that they cannot derive from each other (but can implement interfaces) and cannot have a default (parameterless) constructor. Examples of value types are some primitive types, such as int (a signed 32-bit integer), float (a 32-bit IEEE floating-point number), char (a 16-bit Unicode codepoint), and System.DateTime (identifies a specific point in time with millisecond precision).

In contrast, reference types have the notion of referential identity - each instance of reference type is inherently distinct from every other instance, even if the data within both instances is the same. This is reflected in default equality and inequality comparisons for reference types, which test for referential rather than structural equality, unless the corresponding operators are overloaded (such as the case for System.String). In general, it is not always possible to create an instance of a reference type, nor to copy an existing instance, or perform a value comparison on two existing instances, though specific reference types can provide such services by exposing a public constructor or implementing a corresponding interface (such as ICloneable or IComparable). Examples of reference types are object (the ultimate base class for all other C# classes), System.String (a string of Unicode characters), and System.Array (a base class for all C# arrays).

Both type categories are extensible with user-defined types.
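The following sketch contrasts the copy semantics of a user-defined value type (a struct) and a user-defined reference type (a class); the type names are made up for illustration:

struct PointValue { public int X; } // user-defined value type
class PointRef { public int X; } // user-defined reference type

class CopyDemo
{
static void Main()
{
PointValue a = new PointValue(); a.X = 1;
PointValue b = a; // the data itself is copied
b.X = 2; // a.X is still 1

PointRef c = new PointRef(); c.X = 1;
PointRef d = c; // only the reference is copied
d.X = 2; // c.X is now 2 as well
}
}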


Boxing and unboxing
Boxing is the operation of converting a value of a value type into a value of a corresponding reference type.[7]

Example:

int foo = 42; // Value type...
object bar = foo; // foo is boxed to bar.
Unboxing is the operation of converting a value of a reference type (previously boxed) into a value of a value type.[7]

Example:

int foo = 42; // Value type.
object bar = foo; // foo is boxed to bar.
int foo2 = (int)bar; // Unboxed back to value type.

Features of C# 2.0
New features in C# 2.0 (corresponding to the 3rd edition of the ECMA-334 standard and the .NET Framework 2.0) are:


Partial class
Partial classes allow implementation of a class to be spread between several files, with each file containing one or more class members. It is primarily useful when parts of a class are automatically generated. For example, the feature is heavily used by code-generating user interface designers in Visual Studio.

file1.cs:

public partial class MyClass
{
public void MyMethod1()
{
// Manually written code
}
}
file2.cs:

public partial class MyClass
{
public void MyMethod2()
{
// Automatically generated code
}
}

Generics
Generics, also known as parameterized types or parametric polymorphism, are a .NET 2.0 feature supported by C#. Unlike C++ templates, .NET parameterized types are instantiated at runtime rather than by the compiler; hence they can be cross-language, whereas C++ templates cannot. They support some features not supported directly by C++ templates, such as type constraints on generic parameters expressed through interfaces. On the other hand, C# does not support non-type generic parameters. Unlike generics in Java, .NET generics use reification to make parameterized types first-class objects in the CLI Virtual Machine, which allows for optimizations and preservation of the type information.[8]
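A minimal sketch of a generic method with an interface constraint; the Sequence class and its Max method are invented for illustration and are not part of the framework:

using System;
using System.Collections.Generic;

public static class Sequence
{
// T is constrained to types that implement IComparable<T>.
public static T Max<T>(IEnumerable<T> items) where T : IComparable<T>
{
T best = default(T);
bool first = true;
foreach (T item in items)
{
if (first || item.CompareTo(best) > 0) { best = item; first = false; }
}
return best;
}
}
// Usage: int m = Sequence.Max<int>(new int[] { 3, 9, 4 }); // m == 9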


Static classes
Static classes are classes that cannot be instantiated or inherited from, and that only allow static members. Their purpose is similar to that of modules in many procedural languages.
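A brief sketch of a static class acting as a module-like container of helper methods; MathUtil is a hypothetical name:

public static class MathUtil
{
public static int Square(int x) { return x * x; } // only static members are allowed
}
// Usage: int n = MathUtil.Square(7); // writing "new MathUtil()" would not compile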


A new form of iterator providing generator functionality
A new form of iterator that provides generator functionality, using a yield return construct similar to yield in Python.

// Method that takes an iterable input (possibly an array)
// and returns all even numbers.
public static IEnumerable<int> GetEven(IEnumerable<int> numbers)
{
foreach (int i in numbers)
{
if (i % 2 == 0) yield return i;
}
}

Anonymous delegates
Anonymous delegates provide closure functionality in C#.[9] Code inside the body of an anonymous delegate has full read/write access to local variables, method parameters, and class members in scope of the delegate, excepting out and ref parameters. For example:

int SumOfArrayElements(int[] array)
{
int sum = 0;
Array.ForEach(
array,
delegate(int x)
{
sum += x;
}
);
return sum;
}

Delegate covariance and contravariance
Conversions from method groups to delegate types are covariant and contravariant in return and parameter types, respectively.[10]
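A small sketch of both conversions, using made-up Animal/Cat types: a method with a more derived return type can be assigned to a delegate with a less derived return type, and a method with a less derived parameter type can be assigned to a delegate with a more derived parameter type:

class Animal { }
class Cat : Animal { }

delegate Animal AnimalFactory(); // delegate returning an Animal
delegate void CatHandler(Cat c); // delegate taking a Cat

class VarianceDemo
{
static Cat MakeCat() { return new Cat(); } // more derived return type
static void HandleAnimal(Animal a) { } // less derived parameter type

static void Main()
{
AnimalFactory f = MakeCat; // covariance in the return type
CatHandler h = HandleAnimal; // contravariance in the parameter type
f(); h(new Cat());
}
}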


The accessibility of property accessors can be set independently
Example:

string status = string.Empty;

public string Status
{
get { return status; } // anyone can get value of this property,
protected set { status = value; } // but only derived classes can change it
}

Nullable types
Nullable value types (denoted by a question mark, e.g. int? i = null;) add null to the set of allowed values for any value type. This provides improved interaction with SQL databases, which can have nullable columns of types corresponding to C# primitive types: an SQL INTEGER NULL column type directly translates to the C# int?.
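A short fragment (assumed to sit inside a method body, with using System in scope) showing how such a nullable value might be inspected:

int? maybeAge = null; // an int plus the extra null value
if (maybeAge.HasValue)
Console.WriteLine("Age: " + maybeAge.Value);
else
Console.WriteLine("Age not supplied");
maybeAge = 30; // an ordinary int can be assigned directly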

Nullable types received an eleventh-hour improvement at the end of August 2005, mere weeks before the official launch, to improve their boxing characteristics: a nullable variable which is assigned null is not actually a null reference, but rather an instance of struct Nullable<T> with property HasValue equal to false. When boxed, the Nullable instance itself is boxed, and not the value stored in it, so the resulting reference would always be non-null, even for null values. The following code distinguishes the corrected runtime from the flawed pre-release behaviour:

int? i = null;
object o = i;
if (o == null)
Console.WriteLine("Correct behaviour - runtime version from September 2005 or later");
else
Console.WriteLine("Incorrect behaviour - pre-release runtime (from before September 2005)");
When copied into objects, the official release boxes values from Nullable instances, so null values and null references are considered equal. The late nature of this fix caused some controversy[11] , since it required core-CLR changes affecting not only .NET2, but all dependent technologies (including C#, VB, SQL Server 2005 and Visual Studio 2005).


Null-Coalesce operator
The ?? operator is called the null-coalescing operator and is used to define a default value for nullable value types as well as reference types. It returns the left-hand operand if it is not null; otherwise it returns the right operand.[12]

object nullObj = null;
object obj = new Object();
return nullObj ?? obj; // returns obj
The primary use of this operator is to assign a nullable type to a non-nullable type with an easy syntax:

int? i = null;
int j = i ?? 0; // If i is not null, initialize j to i. Else (if i is null), initialize j to 0.

Features of C# 3.0
C# 3.0 was released on 19 November 2007 as part of .NET Framework 3.5. It includes new features inspired by functional programming languages such as Haskell and ML, and is driven largely by the introduction of the Language Integrated Query (LINQ) pattern to the Common Language Runtime.[13] It is not currently standardized by any standards organisation.


LINQ (Language-Integrated Query)
LINQ is an extensible, general-purpose query language for many kinds of data sources (including plain object collections, XML documents, databases, and so on) which is tightly integrated with other C# language facilities. The syntax heavily borrows from SQL. An example:

int[] array = { 1, 5, 2, 10, 7 };

// Select squares of all odd numbers in the array sorted in descending order
IEnumerable<int> query = from x in array
where x % 2 == 1
orderby x descending
select x * x;
// Result: 49, 25, 1

[edit] Object initializers
Customer c = new Customer(); c.Name = "James";

can be written

Customer c = new Customer { Name="James" };


[edit] Collection initializers
MyList list = new MyList();
list.Add(1);
list.Add(2);
can be written as

MyList list = new MyList { 1, 2 };
assuming that MyList implements System.Collections.IEnumerable and has a public Add method[14]
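
For illustration, a minimal sketch of such a type might look like the following; the member layout here is an assumption, not the actual definition used in the example above:

public class MyList : System.Collections.IEnumerable
{
    private readonly System.Collections.Generic.List<int> items =
        new System.Collections.Generic.List<int>();

    // A public Add method is what the collection initializer syntax calls for each element.
    public void Add(int item) { items.Add(item); }

    // Implementing IEnumerable is what makes the initializer syntax legal on this type.
    public System.Collections.IEnumerator GetEnumerator() { return items.GetEnumerator(); }
}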


[edit] Anonymous types
var x = new { FirstName="James", LastName="Frank" }; [15]
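The members of an anonymous type are read-only properties whose names and types are inferred from the initializer; a small usage sketch:

var person = new { FirstName = "James", LastName = "Frank" };
System.Console.WriteLine(person.FirstName + " " + person.LastName);   // prints "James Frank"
// person.FirstName = "John";   // would not compile: anonymous type members are read-only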


[edit] Local variable type inference
Local variable type inference:

var x = new Dictionary<string, List<int>>();
is interchangeable with

Dictionary<string, List<int>> x = new Dictionary<string, List<int>>();
This feature is not just convenient syntactic sugar for shorter local variable declarations; it is also required for declaring variables of anonymous types.


[edit] Lambda expressions
Lambda expressions provide a concise way to write first-class anonymous function values. Compare the following C# 2.0 snippet:

listOfFoo.Where(delegate(Foo x) { return x.Size > 10; })
with this C# 3.0 equivalent:

listOfFoo.Where(x => x.Size > 10);
In the above examples, lambda expressions are merely short-hand syntax for anonymous delegates with type inference for parameters and return type. However, depending on the context they are used in, a C# compiler can also transform lambdas into ASTs that can then be processed at run-time. In the example above, if listOfFoo is not a plain in-memory collection, but a wrapper around a database table, it could use this technique to translate the body of the lambda into the equivalent SQL expression for optimized execution. Either way, the lambda expression itself looks exactly the same in the code, so the way it is used at run-time is transparent to the client.
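
To illustrate the two compilation targets, the same lambda can be assigned either to a delegate type or to an expression tree type. This sketch assumes a hypothetical Foo type with a Size property, as in the snippets above, and using directives for System and System.Linq.Expressions:

Func<Foo, bool> asDelegate = x => x.Size > 10;           // compiled to executable code (an anonymous method)
Expression<Func<Foo, bool>> asTree = x => x.Size > 10;   // compiled to an expression tree (AST) that can be
                                                         // inspected or translated at run-time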


[edit] Automatic properties
The compiler will generate a private instance variable and the appropriate accessor and mutator given code such as: public string Name { get; private set; }
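For example, a hypothetical Person class using automatic properties might look like this; the compiler supplies the hidden backing fields and the accessor bodies:

public class Person
{
    // Backing fields for Name and Age are generated by the compiler.
    public string Name { get; private set; }
    public int Age { get; set; }

    public Person(string name, int age)
    {
        Name = name;   // the private setter is accessible inside the class
        Age = age;
    }
}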


[edit] Extension methods
Extension methods are a form of syntactic sugar providing the illusion of adding new methods to the existing class outside its definition. In practice, an extension method is a static method that is callable as if it was an instance method; the receiver of the call is bound to the first parameter of the method, decorated with keyword this:

public static class StringExtensions
{
public static string Left(this string s, int n)
{
return s.Substring(0, n);
}
}

string s = "foo";
s.Left(3); // same as StringExtensions.Left(s, 3);

[edit] Partial methods
Partial methods allow code generators to generate method declarations as extension points that are only included in the source code compilation if someone actually implements it in another portion of a partial class.[16]
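
A minimal sketch of the idea (the Order class and OnCreated method are hypothetical, with the two parts typically living in separate files):

// Typically found in a generated file:
partial class Order
{
    // Declaration only; partial methods must return void and are implicitly private.
    partial void OnCreated();

    public Order()
    {
        OnCreated();   // if no implementation exists, the compiler removes this call entirely
    }
}

// Typically found in a hand-written file:
partial class Order
{
    partial void OnCreated()
    {
        System.Console.WriteLine("Order created");
    }
}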


[edit] Features of C# 4.0
This article or section contains information about scheduled or expected future software.
The content may change as the software release approaches and more information becomes available.

The next version of the language, C# 4.0, is under development as of October 2008. Microsoft announced a list of new language features in C# 4.0 at the Microsoft Professional Developers Conference 2008. The major focus of the next version is interoperability with partially or fully dynamically typed languages and frameworks, such as the Dynamic Language Runtime and COM. The following new features were announced:[17]


[edit] Dynamic member lookup
A new pseudo-type dynamic is introduced into the C# type system. It is treated as System.Object, but in addition, any member access (method call, field, property, or indexer access, or a delegate invocation) or application of an operator on a value of such type is permitted without any type checking, and its resolution is postponed until run-time. For example:

// Returns the value of Length property or field of any object
int GetLength(dynamic obj)
{
return obj.Length;
}

GetLength("Hello, world"); // a string has a Length property,
GetLength(new int[] { 1, 2, 3 }); // and so does an array,
GetLength(42); // but not an integer - an exception will be thrown here at run-time
Dynamic method calls are triggered by a value of type "dynamic" as any implicit or explicit parameter (and not just a receiver). For example:

void Print(dynamic obj)
{
Console.WriteLine(obj); // which overload of WriteLine() to call is decided at run-time
}

Print(123); // ends up calling WriteLine(int)
Print("abc"); // ends up calling WriteLine(string)
Dynamic lookup is performed using three distinct mechanisms: COM IDispatch for COM objects, IDynamicObject DLR interface for objects implementing that interface, and Reflection for all other objects. Any C# class can therefore intercept dynamic calls on its instances by implementing IDynamicObject.

In the case of dynamic method and indexer calls, overload resolution happens at run-time according to the actual types of the values passed as arguments, but otherwise according to the usual C# overload resolution rules. Furthermore, in cases where the receiver in a dynamic call is not itself dynamic, run-time overload resolution will only consider the methods that are exposed on the declared compile-time type of the receiver. For example:

class Base
{
public void Foo(double x) { }
}

class Derived : Base
{
public void Foo(int x) { }
}

dynamic x = 123;
Base b = new Derived();
b.Foo(x); // picks Base.Foo(double) because b is of type Base, and Derived.Foo(int) is not exposed
dynamic b1 = b;
b1.Foo(x); // picks Derived.Foo(int)
Any value returned from a dynamic member access is itself of type dynamic. Values of type dynamic are implicitly convertible both from and to any other type. In the code sample above, this permits the GetLength function to treat the value returned by a call to Length as an integer without any explicit cast. At run-time, the actual value will be converted to the requested type.


[edit] Covariant and contravariant generic type parameters
Generic interfaces and delegates can have their type parameters marked as covariant or contravariant, using keywords out and in, respectively. These declarations are then respected for type conversions, both implicit and explicit, and both compile-time and run-time. For example, the existing interface IEnumerable<T> has been redefined as follows:

interface IEnumerable<out T>
{
IEnumerator<T> GetEnumerator();
}
Therefore, any class that implements IEnumerable<Derived> for some class Derived is also considered to be compatible with IEnumerable<Base> for all classes and interfaces Base that Derived extends, directly or indirectly. In practice, it makes it possible to write code such as:

void PrintAll(IEnumerable<object> objects)
{
foreach (object o in objects)
{
Console.WriteLine(o);
}
}

IEnumerable<string> strings = new List<string>();
PrintAll(strings); // IEnumerable<string> is implicitly converted to IEnumerable<object>
For contravariance, the existing interface IComparer<T> has been redefined as follows:

public interface IComparer<in T>
{
int Compare(T x, T y);
}
Therefore, any class that implements IComparer<Base> for some class Base is also considered to be compatible with IComparer<Derived> for all classes and interfaces Derived that are extended from Base. It makes it possible to write code such as:

IComparer<object> objectComparer = GetComparer();
IComparer<string> stringComparer = objectComparer;

[edit] Optional ref Keyword when using COM
The ref keyword for callers of methods is now optional when calling into methods supplied by COM interfaces. Given a COM method with the signature

void Increment(ref int x);
the invocation can now be written as either

Increment(0); // no need for "ref" or a place holder variable any more
or

int x = 0;
Increment(ref x);

[edit] Optional parameters and named arguments
C# 4.0 introduces optional parameters with default values as seen in C++. For example:

void Increment(ref int x, int dx = 1)
{
x += dx;
}

int x = 0;
Increment(ref x); // dx takes the default value of 1
Increment(x, 2); // dx takes the value 2
In addition, to complement optional parameters, it is possible to explicitly specify parameter names in method calls, allowing the caller to selectively pass any given subset of optional parameters for a method. The only restriction is that named arguments must be placed after the unnamed arguments. Parameter names can be specified for both optional and required parameters, and can be used to improve readability or to arbitrarily reorder arguments in a call. For example:

Stream OpenFile(string name, FileMode mode = FileMode.Open, FileAccess access = FileAccess.Read) { ... }

OpenFile("file.txt"); // use default values for both "mode" and "access"
OpenFile("file.txt", mode: FileMode.Create); // use default value for "access"
OpenFile("file.txt", access: FileAccess.Read); // use default value for "mode"
OpenFile(name: "file.txt", access: FileAccess.Read, mode: FileMode.Create); // name all parameters for extra readability,
// and use order different from method declaration
Optional parameters make interoperating with COM easier. Previously, C# code had to pass in every parameter of a COM component's method, even those that are optional. For example:

object fileName = "Test.docx";
object missing = System.Reflection.Missing.Value;

doc.SaveAs(ref fileName,
ref missing, ref missing, ref missing,
ref missing, ref missing, ref missing,
ref missing, ref missing, ref missing,
ref missing, ref missing, ref missing,
ref missing, ref missing, ref missing);
With support for optional parameters, the code can be shortened as

doc.SaveAs(ref fileName);

[edit] Indexed properties
Indexed properties (and default properties) of COM objects are now recognised, but C# objects still do not support them.


[edit] Preprocessor
C# features "preprocessor directives"[18] (though it does not have an actual preprocessor) based on the C preprocessor that allow programmers to define symbols but not macros. Conditionals such as #if, #endif, and #else are also provided. Directives such as #region give hints to editors for code folding.


[edit] Code comments
C# utilizes a double forward slash (//) to indicate the rest of the line is a comment.

public class Foo
{
// a comment
public static void Bar(int firstParam) {} //Also a comment
}
Multi-line comments can be indicated by a starting forward slash/asterisk (/*) and ending asterisk/forward slash (*/).

public class Foo
{
/* A Multi-Line
comment */
public static void Bar(int firstParam) {}
}

[edit] XML documentation system
C#'s documentation system is similar to Java's Javadoc, but based on XML. Two methods of documentation are currently supported by the C# compiler.

Single-line comments, such as those commonly found in Visual Studio generated code, are indicated on a line beginning with ///.

public class Foo
{
/// <summary>A summary of the method.</summary>
/// <param name="firstParam">A description of the parameter.</param>
/// <remarks>Remarks about the method.</remarks>
public static void Bar(int firstParam) {}
}
Multi-line comments, while defined in the version 1.0 language specification, were not supported until the .NET 1.1 release.[19] These comments are designated by a starting forward slash/asterisk/asterisk (/**) and ending asterisk/forward slash (*/)[20].

public class Foo
{
/** <summary>A summary of the method.</summary>
* <param name="firstParam">A description of the parameter.</param>
* <remarks>Remarks about the method.</remarks> */
public static void Bar(int firstParam) {}
}
Note there are some stringent criteria regarding white space and XML documentation when using the forward slash/asterisk/asterisk (/**) technique.

This code block:

/**
* <summary>
* A summary of the method.</summary>
*/
produces a different XML comment than this code block[20]:

/**
* <summary>
A summary of the method.</summary>
*/
Syntax for documentation comments and their XML markup is defined in a non-normative annex of the ECMA C# standard. The same standard also defines rules for processing of such comments, and their transformation to a plain XML document with precise rules for mapping of CLI identifiers to their related documentation elements. This allows any C# IDE or other development tool to find documentation for any symbol in the code in a certain well-defined way.


[edit] Libraries
The C# specification details a minimum set of types and class libraries that the compiler expects to have available. In practice, C# is most often used with some implementation of the Common Language Infrastructure (CLI), which is standardized as ECMA-335 Common Language Infrastructure (CLI).


[edit] "Hello, world" example
The following is a very simple C# program, a version of the classic "Hello world" example:

class ExampleClass
{
static void Main()
{
System.Console.WriteLine("Hello, world!");
}
}
The effect is to write the following text to the output console:

Hello, world!
Each line has a purpose:

class ExampleClass
Above is a class definition. Everything between the following pair of braces describes ExampleClass.

static void Main()
This declares the class member method where the program begins execution. The .NET runtime calls the Main method. (Note: Main may also be called from elsewhere, e.g. from the code Main() in another method of ExampleClass.) The static keyword makes the method accessible without an instance of ExampleClass. Each console application's Main entry point must be declared static. Otherwise, the program would require an instance, but any instance would require a program. To avoid that irresolvable circular dependency, C# compilers processing console applications (like above) report an error if there is no static Main method. The void keyword declares that Main has no return value (see also side effect).

System.Console.WriteLine("Hello, world!");
This line writes the output. Console is a static class in the System namespace. It provides an interface to the standard input, output, and error streams for console applications. The program calls the Console method WriteLine, which displays on the console a line with the argument, the string "Hello, world!".


[edit] Standardization
In August 2000, Microsoft Corporation, Hewlett-Packard and Intel Corporation co-sponsored the submission of specifications for C# as well as the Common Language Infrastructure (CLI) to the standards organization ECMA International. In December 2001, ECMA released ECMA-334 C# Language Specification. C# became an ISO standard in 2003 (ISO/IEC 23270:2006 - Information technology -- Programming languages -- C#). ECMA had previously adopted equivalent specifications as the 2nd edition of C# in December 2002.

In June 2005, ECMA approved edition 3 of the C# specification, and updated ECMA-334. Additions included partial classes, anonymous methods, nullable types, and generics (similar to C++ templates).

In July 2005, ECMA submitted the standards and related TRs to ISO/IEC JTC 1 via the latter's Fast-Track process. This process usually takes 6-9 months.


[edit] Criticism
Although the C# language definition and the CLI are standardized under ISO and ECMA standards, the class library standardized as part of the CLI covers only a part of Microsoft's Base Class Library (BCL), which also contains non-standardized classes that are used by many C# programs (some extended IO, user interface, Web services, etc.). Furthermore, parts of the BCL have been patented by Microsoft,[21][22] which may deter independent implementations of the full framework, as only the standardized portions have RAND protection from patent claims.


[edit] Implementations
The most commonly used C# compiler is Microsoft Visual C#.

C# compilers are also provided with:

Microsoft's Rotor project (currently called Shared Source Common Language Infrastructure) (licensed for educational and research use only) provides a shared source implementation of the CLR runtime and a C# compiler, and a subset of the required Common Language Infrastructure framework libraries in the ECMA specification.
The Mono project provides an open source C# compiler, a complete open source implementation of the Common Language Infrastructure including the required framework libraries as they appear in the ECMA specification, and a nearly complete implementation of the Microsoft proprietary .NET class libraries up to .NET 2.0, but not specific .NET 3.0 and .NET 3.5 libraries, as of Mono 2.0.
The DotGNU project also provides an open source C# compiler, a nearly complete implementation of the Common Language Infrastructure including the required framework libraries as they appear in the ECMA specification, and subset of some of the remaining Microsoft proprietary .NET class libraries up to .NET 2.0 (those not documented or included in the ECMA specification but included in Microsoft's standard .NET Framework distribution).
DotNetAnywhere [3] - a Micro Framework-like Common Language Runtime aimed at embedded systems; it supports almost all of the C# 2.0 specification.

[edit] Language name

The name "C sharp" was inspired by musical notation, where a sharp indicates that the written note should be made a half-step higher in pitch.[23] This is similar to the language name C++, where the "++" operator indicates that a variable should be incremented by 1. The sharp symbol also resembles a ligature of four "+" symbols (arranged in a two-by-two grid), further implying that the language is an increment of C++.

Due to technical limitations of display (fonts, browsers, etc.) and the fact that the sharp symbol (♯, U+266F, MUSIC SHARP SIGN) is not present on the standard keyboard, the number sign (#, U+0023, NUMBER SIGN) was chosen to represent the sharp symbol in the written name of the programming language.[24] This convention is reflected in the ECMA-334 C# Language Specification.[25] However, when it is practical to do so (for example, in advertising or in box art[26]), Microsoft will use the intended musical sharp symbol.

The "sharp" suffix has been used by a number of other .NET languages that are variants of existing languages, including J# (a .NET language also designed by Microsoft which is derived from Java 1.1), A# (from Ada), and the functional F#.[27] The original implementation of Eiffel for .NET was called Eiffel#,[28] a name since retired since the full Eiffel language is now supported. The suffix is also sometimes used for libraries, such as Gtk# (a .NET wrapper for GTK+ and other GNOME libraries), Cocoa# (a wrapper for Cocoa) and Qt# (a .NET language binding for the Qt toolkit).


[edit] See also
C# Syntax

[edit] Environments and tools
Microsoft Visual Studio, IDE for C#
SharpDevelop, an open-source C# IDE for Windows
MonoDevelop, an open-source C# IDE for Linux
QuickSharp 2008, a simplified development environment for C#
Morfik C#, a C# to JavaScript compiler complete with IDE and framework for Web application development.
Baltie, an educational IDE for children and students with little or no programming experience
Borland Turbo C Sharp

[edit] Related languages

Spec#
Sing#
Parallel C#

[edit] Comparisons
Alphabetical list of programming languages
Comparison of programming languages

[edit] Notes
^ In Java 5.0, several features (foreach, autoboxing, varargs, annotations and enums) were introduced, after proving themselves useful in the C# language (with only minor differences in name and implementation). [1][2]
^ "Jason Zander on the history of .NET". http://blogs.msdn.com/jasonz/archive/2007/11/23/couple-of-historical-facts.aspx. Retrieved on 2008-02-21.
^ "C# 3.0 New Features". http://www.learnitonweb.com/Articles/ReadArticle.aspx?contId=4&page=1. Retrieved on 2008-06-08.
^ "Scott Guthrie on the origins of ASP.Net". http://aspadvice.com/blogs/rbirkby/commentrss.aspx?PostID=24972. Retrieved on 2008-02-21.
^ Naomi Hamilton (October 1, 2008). "The A-Z of Programming Languages: C#". http://www.computerworld.com.au/index.php/id;1149786074;fp;;fpid;;pf;1. Retrieved on 2008-10-01.
^ "Programming language history chart". http://www.levenez.com/lang/history.html.
^ a b c d Archer, Part 2, Chapter 4:The Type System
^ An Introduction to C# Generics
^ Anonymous Methods (C#)
^ Covariance and Contravariance in Delegates (C#)
^ Somasegar (August 11, 2005). "Nulls not missing anymore". Somasegar's WebLog. MSDN. http://blogs.msdn.com/somasegar/archive/2005/08/11/450640.aspx. Retrieved on 2008-11-05.
^ "?? Operator (C# Reference)". Microsoft. http://msdn.microsoft.com/en-us/library/ms173224.aspx. Retrieved on 2008-11-23.
^ Tim Anderson (November 14, 2006). "C# pulling ahead of Java - Lead architect paints rosy C# picture". Reg Developer. The Register. http://www.regdeveloper.co.uk/2006/11/14/c-sharp_hejlsberg/. Retrieved on 2007-01-20.
^ The Mellow Musings of Dr. T : What is a collection?
^ Anonymous Types (C# Programming Guide)
^ "Partial Methods". http://blogs.msdn.com/vbteam/archive/2007/03/27/partial-methods.aspx. Retrieved on 2007-10-06.
^ Mads Torgersen. "New features in C# 4.0". http://code.msdn.microsoft.com/csharpfuture/Release/ProjectReleases.aspx?ReleaseId=1686. Retrieved on 2008-10-28.
^ C# Preprocessor Directives
^ Anson Horton (2007-09-11). "C# XML documentation comments FAQ". http://blogs.msdn.com/ansonh/archive/2006/09/11/750056.aspx. Retrieved on 2007-12-11.
^ a b http://msdn2.microsoft.com/en-us/library/5fz4y783(VS.71).aspx Delimiters for Documentation Tags
^ See .NET Framework
^ See Mono and Microsoft’s patents
^ "C#/.NET History Lesson". 2008-03-25. http://www.jameskovacs.com/blog/CNETHistoryLesson.aspx.
^ "Microsoft C# FAQ". http://msdn.microsoft.com/vcsharp/previous/2002/FAQ/default.aspx. Retrieved on 2008-03-25.
^ Standard ECMA-334 C# Language Specification. 4th edition (June 2006).
^ Microsoft.com
^ "Microsoft F# FAQ". http://research.microsoft.com/fsharp/faq.aspx.
^ http://msdn.microsoft.com/en-us/library/ms973898.aspx

[edit] References
Archer, Tom (2001). Inside C#. Microsoft Press. ISBN 0-7356-1288-9.
C# Language Pocket Reference. O'Reilly. 2002. ISBN 0-596-00429-X.
Petzold, Charles (2002). Programming Microsoft Windows with C#. Microsoft Press. ISBN 0-7356-1370-2.

[edit] External links
This article's external links may not follow Wikipedia's content policies or guidelines. Please improve this article by removing excessive or inappropriate external links.
Wikibooks has a book on the topic of
C Sharp Programming
C# Language (MSDN)
C# Programming Guide (MSDN)
C# Specification (MSDN)
ECMA-334 C# Language Specification - hyperlinked
ECMA-334 C# Language Specification PDF (5.59 MB)
ISO C# Language Specification - Purchase version or free version.
Microsoft Visual C# .NET
CSharpUniversity.com - High Quality C# and ASP.NET lessons with source code and videos - Unique and fun interactive learning website with blog.

Monday, March 9, 2009

Assembly Language

Assembly language
From Wikipedia, the free encyclopedia
Jump to: navigation, search
See the terminology section below for information regarding inconsistent use of the terms assembly and assembler.
An assembly language is a low-level language for programming computers. It implements a symbolic representation of the numeric machine codes and other constants needed to program a particular CPU architecture. This representation is usually defined by the hardware manufacturer, and is based on abbreviations (called mnemonics) that help the programmer remember individual instructions, registers, etc. An assembly language is thus specific to a certain physical or virtual computer architecture (as opposed to most high-level languages, which are usually portable).

Assembly languages were first developed in the 1950s, when they were referred to as second generation programming languages. They eliminated much of the error-prone and time-consuming first-generation programming needed with the earliest computers, freeing the programmer from tedium such as remembering numeric codes and calculating addresses. They were once widely used for all sorts of programming. However, by the 1980s (1990s on small computers), their use had largely been supplanted by high-level languages, in the search for improved programming productivity. Today, assembly language is used primarily for direct hardware manipulation, access to specialized processor instructions, or to address critical performance issues. Typical uses are device drivers, low-level embedded systems, and real-time systems.

A utility program called an assembler is used to translate assembly language statements into the target computer's machine code. The assembler performs a more or less isomorphic translation (a one-to-one mapping) from mnemonic statements into machine instructions and data. (This is in contrast with high-level languages, in which a single statement generally results in many machine instructions.)

Many sophisticated assemblers offer additional mechanisms to facilitate program development, control the assembly process, and aid debugging. In particular, most modern assemblers (although many have been available for more than 40 years already) include a macro facility (described below), and are called macro assemblers.

Contents [hide]
1 Key concepts
1.1 Assembler
1.2 Assembly language
2 Language design
2.1 Basic elements
2.2 Macros
2.3 Support for structured programming
3 Use of assembly language
3.1 Historical perspective
3.2 Current usage
3.3 Typical applications
4 Related terminology
5 Further details
6 Example listing of assembly language source code
7 See also
8 References
9 Further reading
10 External links
10.1 Software



[edit] Key concepts

[edit] Assembler
Compare with: Microassembler.
Typically a modern assembler creates object code by translating assembly instruction mnemonics into opcodes, and by resolving symbolic names for memory locations and other entities.[1] The use of symbolic references is a key feature of assemblers, saving tedious calculations and manual address updates after program modifications. Most assemblers also include macro facilities for performing textual substitution—e.g., to generate common short sequences of instructions to run inline, instead of in a subroutine.

Assemblers are generally simpler to write than compilers for high-level languages, and have been available since the 1950s. Modern assemblers, especially for RISC based architectures, such as MIPS, Sun SPARC, HP PA-RISC and x86(-64), optimize instruction scheduling to exploit the CPU pipeline efficiently.

More sophisticated high-level assemblers provide language abstractions such as:

Advanced control structures
High-level procedure/function declarations and invocations
High-level abstract data types, including structures/records, unions, classes, and sets
Sophisticated macro processing
Object-Oriented features such as encapsulation, polymorphism, inheritance, interfaces
See Language design below for more details.

Note that, in normal professional usage, the term assembler is often used ambiguously: It is frequently used to refer to an assembly language itself, rather than to the assembler utility. Thus: "CP/CMS was written in S/360 assembler" as opposed to "ASM-H was a widely-used S/370 assembler."[citation needed]


[edit] Assembly language
A program written in assembly language consists of a series of instructions: mnemonic statements that, when translated by an assembler, correspond to a stream of executable machine instructions which can be loaded into memory and executed.

For example, an x86/IA-32 processor can execute the following binary instruction as expressed in machine language (see x86 assembly language):

Binary: 10110000 01100001 (Hexadecimal: B0 61)
The equivalent assembly language representation is easier to remember (example in Intel syntax, more mnemonic):

MOV AL, #61h
This instruction means:

Move the value 61h (or 97 decimal; the h-suffix means hexadecimal; the pound sign means move the immediate value, not location) into the processor register named "AL".
The mnemonic "mov" represents the opcode 1011 which moves the value in the second operand into the register indicated by the first operand. The mnemonic was chosen by the instruction set designer to abbreviate "move", making it easier for the programmer to remember. A comma-separated list of arguments or parameters follows the opcode; this is a typical assembly language statement.

In practice many programmers drop the word mnemonic and, technically incorrectly, call "mov" an opcode. When they do this they are referring to the underlying binary code which it represents. To put it another way, a mnemonic such as "mov" is not an opcode, but because it symbolizes an opcode, one might refer to "the opcode mov" when one intends the binary opcode it symbolizes rather than the symbol (the mnemonic) itself. Since few modern programmers need to be mindful of the actual binary patterns that make up the opcodes for specific instructions, the distinction has in practice become somewhat blurred among programmers, though not among processor designers.

Transforming assembly into machine language is accomplished by an assembler, and the reverse by a disassembler. Unlike in high-level languages, there is usually a one-to-one correspondence between simple assembly statements and machine language instructions. However, in some cases, an assembler may provide pseudoinstructions which expand into several machine language instructions to provide commonly needed functionality. For example, for a machine that lacks a "branch if greater or equal" instruction, an assembler may provide a pseudoinstruction that expands to the machine's "set if less than" and "branch if zero (on the result of the set instruction)". Most full-featured assemblers also provide a rich macro language (discussed below) which is used by vendors and programmers to generate more complex code and data sequences.

Each computer architecture and processor architecture has its own machine language. On this level, each instruction is simple enough to be executed using a relatively small number of electronic circuits. Computers differ by the number and type of operations they support. For example, a new 64-bit machine would have different circuitry from a 32-bit machine. They may also have different sizes and numbers of registers, and different representations of data types in storage. While most general-purpose computers are able to carry out essentially the same functionality, the ways they do so differ; the corresponding assembly languages reflect these differences.

Multiple sets of mnemonics or assembly-language syntax may exist for a single instruction set, typically instantiated in different assembler programs. In these cases, the most popular one is usually that supplied by the manufacturer and used in its documentation.


[edit] Language design

[edit] Basic elements
Instructions (statements) in assembly language are generally very simple, unlike those in high-level languages. Each instruction typically consists of an operation or opcode plus zero or more operands. Most instructions refer to a single value, or a pair of values. Generally, an opcode is a symbolic name for a single executable machine language instruction. Operands can be either immediate (typically one byte values, coded in the instruction itself) or the addresses of data located elsewhere in storage. This is determined by the underlying processor architecture: the assembler merely reflects how this architecture works.

Most modern assemblers also support pseudo-operations, which are directives obeyed by the assembler at assembly time instead of the CPU at run time. (For example, pseudo-ops would be used to reserve storage areas and optionally set their initial contents.) The names of pseudo-ops often start with a dot to distinguish them from machine instructions.

Some assemblers also support pseudo-instructions, which generate two or more machine instructions.

Symbolic assemblers allow programmers to associate arbitrary names (labels or symbols) with memory locations. Usually, every constant and variable is given a name so instructions can reference those locations by name, thus promoting self-documenting code. In executable code, the name of each subroutine is associated with its entry point, so any calls to a subroutine can use its name. Inside subroutines, GOTO destinations are given labels. Some assemblers support local symbols which are lexically distinct from normal symbols (e.g., the use of "10$" as a GOTO destination).

Most assemblers provide flexible symbol management, allowing programmers to manage different namespaces, automatically calculate offsets within data structures, and assign labels that refer to literal values or the result of simple computations performed by the assembler. Labels can also be used to initialize constants and variables with relocatable addresses.

Assembly languages, like most other computer languages, allow comments to be added to assembly source code that are ignored by the assembler. Good use of comments is even more important with assembly code than with higher-level languages, as the meaning of a sequence of instructions is harder to decipher from the code itself.

Wise use of these facilities can greatly simplify the problems of coding and maintaining low-level code. Raw assembly source code as generated by compilers or disassemblers — code without any comments, meaningful symbols, or data definitions — is quite difficult to read when changes must be made.


[edit] Macros
Many assemblers support macros, programmer-defined symbols that stand for some sequence of text lines. This sequence of text lines may include a sequence of instructions, or a sequence of data storage pseudo-ops. Once a macro has been defined using the appropriate pseudo-op, its name may be used in place of a mnemonic. When the assembler processes such a statement, it replaces the statement with the text lines associated with that macro, then processes them just as though they had appeared in the source code file all along (including, in better assemblers, expansion of any macros appearing in the replacement text).

Since macros can have 'short' names but expand to several or indeed many lines of code, they can be used to make assembly language programs appear to be much shorter (requiring fewer lines of source code from the application programmer, as with a higher-level language). They can also be used to add higher levels of structure to assembly programs, and optionally to introduce embedded debugging code via parameters and other similar features.

Many assemblers have built-in macros for system calls and other special code sequences.

Macro assemblers often allow macros to take parameters. Some assemblers include quite sophisticated macro languages, incorporating such high-level language elements as optional parameters, symbolic variables, conditionals, string manipulation, and arithmetic operations, all usable during the expansion of a given macro, and allowing macros to save context or exchange information. Thus a macro might generate a large number of assembly language instructions or data definitions, based on the macro arguments. This could be used to generate record-style data structures or "unrolled" loops, for example, or could generate entire algorithms based on complex parameters. An organization using assembly language that has been heavily extended using such a macro suite can be considered to be working in a higher-level language, since such programmers are not working with a computer's lowest-level conceptual elements.

Macros were used to customize large-scale software systems for specific customers in the mainframe era, and were also used by customer personnel to satisfy their employers' needs by making specific versions of manufacturer operating systems. This was done, for example, by systems programmers working with IBM's Conversational Monitor System / Virtual Machine (CMS/VM), with IBM's "real time transaction processing" add-on CICS (Customer Information Control System), and with TPF, the airline/financial system that began in the 1970s and still runs many large Global Distribution Systems (GDS) and credit card systems today.

It was also possible to use solely the macro processing capabilities of an assembler to generate code written in completely different languages, for example, to generate a version of a program in Cobol using a pure macro assembler program containing lines of Cobol code inside assembly time operators instructing the assembler to generate arbitrary code.

This was because, as was realized in the 1970s, the concept of "macro processing" is independent of the concept of "assembly"; in modern terms, the former is closer to text processing than to generating object code. The concept of macro processing appeared, and still appears, in the C programming language, which supports "preprocessor instructions" to set variables and make conditional tests on their values. Unlike certain earlier macro processors inside assemblers, the C preprocessor was not Turing-complete, because it lacked the ability to loop or "go to" (the latter of which would allow a programmer to construct loops).

Despite the power of macro processing, it fell into disuse in high level languages while remaining a perennial for assemblers.

This was because many programmers were rather confused by macro parameter substitution and did not disambiguate macro processing from assembly and execution.

Macro parameter substitution is strictly by name: at macro processing time, the value of a parameter is textually substituted for its name. The most famous class of bugs resulting was the use of a parameter that itself was an expression and not a simple name when the macro writer expected a name. In the macro:

foo: macro a
load a*b

the intention was that the caller would provide the name of a variable, and the "global" variable or constant b would be used to multiply "a". If foo is called with the parameter a-c, the expansion load a-c*b occurs, which, because multiplication binds more tightly than subtraction, does not compute what the caller intended.

To avoid this, users of macro processors learned to religiously parenthesize formal parameters inside macro definitions, and callers had to do the same to their "actual" parameters.

PL/I and C feature macros, but this facility was underused or dangerous when used because they can only manipulate text. On the other hand, homoiconic languages, such as Lisp, Prolog, and Forth, retain the power of assembly language macros because they are able to manipulate their own code as data.


[edit] Support for structured programming
Some assemblers have incorporated structured programming elements to encode execution flow. The earliest example of this approach was in the Concept-14 macro set developed by Marvin Zloof at IBM's Thomas Watson Research Center, which extended the S/370 macro assembler with IF/ELSE/ENDIF and similar control flow blocks. This was a way to reduce or eliminate the use of GOTO operations in assembly code, one of the main factors causing spaghetti code in assembly language. This approach was widely accepted in the early 80s (the latter days of large-scale assembly language use).

A curious design was A-natural, a "stream-oriented" assembler for 8080/Z80 processors[citation needed] from Whitesmiths Ltd. (developers of the Unix-like Idris operating system, and what was reported to be the first commercial C compiler). The language was classified as an assembler, because it worked with raw machine elements such as opcodes, registers, and memory references; but it incorporated an expression syntax to indicate execution order. Parentheses and other special symbols, along with block-oriented structured programming constructs, controlled the sequence of the generated instructions. A-natural was built as the object language of a C compiler, rather than for hand-coding, but its logical syntax won some fans.

There has been little apparent demand for more sophisticated assemblers since the decline of large-scale assembly language development.[2] In spite of that, they are still being developed and applied in cases where resource constraints or peculiarities in the target system's architecture prevent the effective use of higher-level languages.[3]


[edit] Use of assembly language

[edit] Historical perspective
Historically, a large number of programs have been written entirely in assembly language. Operating systems were almost exclusively written in assembly language until the widespread acceptance of C in the 1970s and early 1980s. Many commercial applications were written in assembly language as well, including a large amount of the IBM mainframe software written by large corporations. COBOL and FORTRAN eventually displaced much of this work, although a number of large organizations retained assembly-language application infrastructures well into the 90s.

Most early microcomputers relied on hand-coded assembly language, including most operating systems and large applications. This was because these systems had severe resource constraints, imposed idiosyncratic memory and display architectures, and provided limited, buggy system services. Perhaps more important was the lack of first-class high-level language compilers suitable for microcomputer use. A psychological factor may have also played a role: the first generation of microcomputer programmers retained a hobbyist, "wires and pliers" attitude.

In a more commercial context, the biggest reasons for using assembly language were size, speed, and reliability: the writers of Cardbox-Plus said simply "we use assembler because then all the bugs are ours". This held true for 8-bit versions of the program, which had no bugs at all, but ironically it turned out to be false with 16 bits: Cardbox-Plus 2.0 had to be upgraded to Cardbox-Plus 2.1 because a bug in Microsoft's macro assembler caused Cardbox-Plus to index the number "-0" differently from the number "0".[citation needed]

Typical examples of large assembly language programs from this time are the MS-DOS operating system, the early IBM PC spreadsheet program Lotus 1-2-3, and almost all popular games for the Atari 800 family of home computers. Even into the 1990s, most console video games were written in assembly, including most games for the Mega Drive/Genesis and the Super Nintendo Entertainment System[citation needed]. According to some industry insiders, the assembly language was the best computer language to use to get the best performance out of the Sega Saturn, a console that was notoriously challenging to develop and program games for [4]. The popular arcade game NBA Jam (1993) is another example. On the Commodore 64, Amiga, Atari ST, as well as ZX Spectrum home computers, assembler has long been the primary development language. This was in large part due to the fact that BASIC dialects on these systems offered insufficient execution speed, as well as insufficient facilities to take full advantage of the available hardware on these systems. Some systems, most notably Amiga, even have IDEs with highly advanced debugging and macro facilities, such as the freeware ASM-One assembler, comparable to that of Microsoft Visual Studio facilities (ASM-One predates Microsoft Visual Studio).

The Assembler for the VIC-20 was written by Don French and published by French Silk. At 1639 bytes in length, its author believes it is the smallest symbolic assembler ever written. The assembler supported the usual symbolic addressing and the definition of character strings or hex strings. It also allowed address expressions which could be combined with addition, subtraction, multiplication, division, logical AND, logical OR, and exponentiation operators.[5]


[edit] Current usage
There have always been debates over the usefulness and performance of assembly language relative to high-level languages. Assembly language has specific niche uses where it is important; see below. But in general, modern optimizing compilers are claimed to render high-level languages into code that can run as fast as hand-written assembly, despite some counter-examples that can be created. The complexity of modern processors makes effective hand-optimization increasingly difficult.[6] Moreover, and to the dismay of efficiency lovers, increasing processor performance has meant that most CPUs sit idle most of the time, with delays caused by predictable bottlenecks such as I/O operations and paging. This has made raw code execution speed a non-issue for most programmers.

Here are some situations in which practitioners might choose to use assembly language:

When a stand-alone binary executable is required, i.e. one that must execute without recourse to the run-time components or libraries associated with a high-level language; this is perhaps the most common situation. These are often embedded programs that use only a small amount of memory, on devices intended for single-purpose tasks. Examples include telephones, automobile fuel and ignition systems, air-conditioning control systems, security systems, and sensors.
When interacting directly with the hardware, for example in device drivers.
When using processor-specific instructions not exploited by or available to the compiler. A common example is the bitwise rotation instruction at the core of many encryption algorithms.
Embedded systems.
When extreme optimization is required, e.g., in an inner loop in a processor-intensive algorithm. Some game programmers are experts at writing code that takes advantage of the capabilities of hardware features in systems enabling the games to run faster.
When a system with severe resource constraints (e.g., an embedded system) must be hand-coded to maximize the use of limited resources; but this is becoming less common as processor price/performance improves
When no high-level language exists, e.g., on a new or specialized processor
Real-time programs that need precise timing and responses, such as simulations, flight navigation systems, and medical equipment. (For example, in a fly-by-wire system, telemetry must be interpreted and acted upon within strict time constraints. Such systems must eliminate sources of unpredictable delays – such as may be created by interpreted languages, automatic garbage collection, paging operations, or preemptive multitasking. Some higher-level languages incorporate run-time components and operating system interfaces that can introduce such delays. Choosing assembly or lower-level languages for such systems gives the programmer greater visibility and control over processing details.)
When complete control over the environment is required (for example in extremely high security situations, where nothing can be taken for granted).
When writing computer viruses, bootloaders, certain device drivers, or other items very close to the hardware or low-level operating system.
When reverse-engineering existing binaries, which may or may not have originally been written in a high-level language, for example when cracking copy protection of proprietary software.
Reverse engineering and modification of video games (known as ROM Hacking), commonly done to games for Nintendo hardware such as the SNES and NES, is possible with a range of techniques, of which the most widely employed is altering the program code at the assembly language level.
Assembly language lends itself well to applications requiring Self modifying code.
Assembly language is sometimes used for writing games and other software for graphing calculators.[7]
Finally, compiler writers usually write software that generates assembly code, and should therefore be expert assembly language programmers themselves.
Nevertheless, assembly language is still taught in most Computer Science and Electronic Engineering programs. Although few programmers today regularly work with assembly language as a tool, the underlying concepts remain very important. Such fundamental topics as binary arithmetic, memory allocation, stack processing, character set encoding, interrupt processing, and compiler design would be hard to study in detail without a grasp of how a computer operates at the hardware level. Since a computer's behavior is fundamentally defined by its instruction set, the logical way to learn such concepts is to study an assembly language. Most modern computers have similar instruction sets. Therefore, studying a single assembly language is sufficient to learn: i) The basic concepts; ii) To recognize situations where the use of assembly language might be appropriate; and iii) To see how efficient executable code can be created from high-level languages.[8]


[edit] Typical applications
Hard-coded assembly language is typically used in a system's boot ROM (BIOS on IBM-compatible PC systems). This low-level code is used, among other things, to initialize and test the system hardware prior to booting the OS, and is stored in ROM. Once a certain level of hardware initialization has taken place, execution transfers to other code, typically written in higher level languages; but the code running immediately after power is applied is usually written in assembly language. The same is true of most boot loaders.

Many compilers render high-level languages into assembly first before fully compiling, allowing the assembly code to be viewed for debugging and optimization purposes. Relatively low-level languages, such as C, often provide special syntax to embed assembly language directly in the source code. Programs using such facilities, such as the Linux kernel, can then construct abstractions utilizing different assembly language on each hardware platform. The system's portable code can then utilize these processor-specific components through a uniform interface.

Assembly language is also valuable in reverse engineering, since many programs are distributed only in machine code form, and machine code is usually easy to translate into assembly language and carefully examine in this form, but very difficult to translate into a higher-level language. Tools such as the Interactive Disassembler make extensive use of disassembly for such a purpose.

A particular niche that makes use of assembly language is the demoscene. Certain competitions require the contestants to restrict their creations to a very small size (e.g. 256B, 1KB, 4KB or 64 KB), and assembly language is the language of choice to achieve this goal.[9] When resources, particularly CPU-processing constrained systems, like the earlier Amiga models, and the Commodore 64, are a concern, assembler coding is a must: optimized assembler code is written "by hand" and instructions are sequenced manually by the coders in an attempt to minimize the number of CPU cycles used; the CPU constraints are so great that every CPU cycle counts. However, using such techniques has enabled systems like the Commodore 64 to produce real-time 3D graphics with advanced effects, a feat which might be considered unlikely or even impossible for a system with a 0.99MHz processor.


[edit] Related terminology
Assembly language or assembler language is commonly called assembly, assembler, ASM, or symbolic machine code. A generation of IBM mainframe programmers called it BAL for Basic Assembly Language.
Note: Calling the language assembler is of course potentially confusing and ambiguous, since this is also the name of the utility program that translates assembly language statements into machine code. Some may regard this as imprecision or error. However, this usage has been common among professionals and in the literature for decades.[10] Similarly, some early computers called their assembler their assembly program.[11]
The computational step where an assembler is run, including all macro processing, is known as assembly time.
The use of the word assembly dates from the early years of computers (cf. short code, speedcode).
A cross assembler (see cross compiler) produces code using one type of processor, which runs on a different type of processor. This technology is particularly important when developing software for new processors, or when developing for embedded systems. This allows, for instance, a 32-bit x86 processor to assemble code to run on a 64-bit x64 processor.
An assembler directive is a command given to an assembler. These directives may do anything from telling the assembler to include other source files, to telling it to allocate memory for constant data.

[edit] Further details
For any given personal computer, mainframe, embedded system, and game console, both past and present, at least one--possibly dozens--of assemblers have been written. For some examples, see the list of assemblers.

On Unix systems, the assembler is traditionally called as, although it is not a single body of code, being typically written anew for each port. A number of Unix variants use GAS.

Within processor groups, each assembler has its own dialect. Sometimes, some assemblers can read another assembler's dialect; for example, TASM can read old MASM code, but not the reverse. FASM and NASM have similar syntax, but each supports different macros that can make them difficult to translate to each other. The basics are all the same, but the advanced features differ.[12]

Also, assembly can sometimes be portable across different operating systems on the same type of CPU. Calling conventions between operating systems often differ slightly or not at all, and with care it is possible to gain some portability in assembly language, usually by linking with a C library that does not change between operating systems.

For example, many things in libc depend on the preprocessor to do OS-specific, C-specific things to the program before compiling. In fact, some functions and symbols are not even guaranteed to exist outside of the preprocessor. Worse, the size and field order of structs, as well as the size of certain typedefs such as off_t, are entirely unavailable in assembly language without help from a configure script, and differ even between versions of Linux, making it impossible to portably call functions in libc other than ones that only take simple integers and pointers as parameters. To address this issue, the FASMLIB project provides a portable assembly library for Win32 and Linux platforms, but it is still very incomplete.[13]

Some higher level computer languages, such as C and Borland Pascal, support inline assembly where relatively brief sections of assembly code can be embedded into the high level language code. The Forth programming language commonly contains an assembler used in CODE words.

Many people use an emulator to debug assembly-language programs.


[edit] Example listing of assembly language source code
Address Label Instruction (AT&T syntax) Object code[14]
.begin
.org 2048
a_start .equ 3000
2048 ld length,%
2064 be done 00000010 10000000 00000000 00000110
2068 addcc %r1,-4,%r1 10000010 10000000 01111111 11111100
2072 addcc %r1,%r2,%r4 10001000 10000000 01000000 00000010
2076 ld %r4,%r5 11001010 00000001 00000000 00000000
2080 ba loop 00010000 10111111 11111111 11111011
2084 addcc %r3,%r5,%r3 10000110 10000000 11000000 00000101
2088 done: jmpl %r15+4,%r0 10000001 11000011 11100000 00000100
2092 length: 20 00000000 00000000 00000000 00010100
2096 address: a_start 00000000 00000000 00001011 10111000
.org a_start
3000 a:

Example of a selection of instructions (for a virtual computer[15]) with the corresponding address in memory where each instruction will be placed. These addresses are not static, see memory management. Accompanying each instruction is the generated (by the assembler) object code that coincides with the virtual computer's architecture (or ISA).


[edit] See also
Little man computer - an educational computer model with a base-10 assembly language
x86 assembly language - the assembly language for common Intel 80x86 microprocessors
Compiler
Disassembler
List of assemblers
Instruction set
Microassembler
MACRO-11

[edit] References
^ David Salomon (1993). Assemblers and Loaders
^ Answers.com. "assembly language: Definition and Much More from Answers.com". http://www.answers.com/topic/assembly-language?cat=technology. Retrieved on 2008-06-19.
^ NESHLA: The High Level, Open Source, 6502 Assembler for the Nintendo Entertainment System
^ Eidolon's Inn : SegaBase Saturn
^ Jim Lawless (2004-05-21). "Speaking with Don French : The Man Behind the French Silk Assembler Tools". http://www.radiks.net/~jimbo/art/int7.htm. Retrieved on 2008-07-25.
^ Randall Hyde. "The Great Debate". http://webster.cs.ucr.edu/Page_TechDocs/GreatDebate/debate1.html. Retrieved on 2008-07-03.
^ "68K Programming in Fargo II". http://tifreakware.net/tutorials/89/a/calc/fargoii.htm. Retrieved on 2008-07-03.
^ Hyde, op. cit., Foreword ("Why would anyone learn this stuff?")
^ "256bytes demos archives". http://web.archive.org/web/20080211025322rn_1/www.256b.com/home.php. Retrieved on 2008-07-03.
^ Stroustrup, Bjarne, The C++ Programming Language, Addison-Wesley, 1986, ISBN 0-201-12078-X: "C++ was primarily designed so that the author and his friends would not have to program in assembler, C, or various modern high-level languages. [use of the term assembler to mean assembly language]"
^ Saxon, James, and Plette, William, Programming the IBM 1401, Prentice-Hall, 1962, LoC 62-20615. [use of the term assembly program]
^ Randall Hyde. "Which Assembler is the Best?". http://webster.cs.ucr.edu/AsmTools/WhichAsm.html. Retrieved on 2007-10-19.
^ "vid". "FASMLIB: Features". http://fasmlib.x86asm.net/features.html. Retrieved on 2007-10-19.
^ Murdocca, Miles J.; Vincent P. Heuring (2000). Principles of Computer Architecture. Prentice-Hall. ISBN 0-201-43664-7.
^ Principles of Computer Architecture (POCA) – ARCTools virtual computer available for download to execute referenced code, accessed August 24, 2005

[edit] Further reading
Michael Singer, PDP-11. Assembler Language Programming and Machine Organization, John Wiley & Sons, NY: 1980.
Peter Norton, John Socha, Peter Norton's Assembly Language Book for the IBM PC, Brady Books, NY: 1986.
Dominic Sweetman: See MIPS Run. Morgan Kaufmann Publishers, 1999. ISBN 1-55860-410-3
John Waldron: Introduction to RISC Assembly Language Programming. Addison Wesley, 1998. ISBN 0-201-39828-1
Jeff Duntemann: Assembly Language Step-by-Step. Wiley, 2000. ISBN 0-471-37523-3
Paul Carter: PC Assembly Language. Free ebook, 2001 (website).
Robert Britton: MIPS Assembly Language Programming. Prentice Hall, 2003. ISBN 0-13-142044-5
Randall Hyde: The Art of Assembly Language. No Starch Press, 2003. ISBN 1-886411-97-2. Draft versions available online as PDF and HTML.
Jonathan Bartlett: Programming from the Ground Up. Bartlett Publishing, 2004. ISBN 0-9752838-4-7. Also available online as PDF.
ASM Community Book "An online book full of helpful ASM info, tutorials and code examples" by the ASM Community

[edit] External links
Look up assembly language in Wiktionary, the free dictionary. Wikibooks has a book on the topic: Subject:Assembly Language
Randall Hyde's The Art of Assembly Language as HTML and PDF version
Machine language for beginners
Introduction to assembly language
The ASM Community, a programming resource about assembly including a messageboard and an ASM Book
Intel Assembly 80x86 CodeTable (a cheat sheet reference)
Unix Assembly Language Programming
PPR: Learning Assembly Language
An Introduction to Writing 32-bit Applications Using the x86 Assembly Language
Assembly Language Programming Examples
Typed Assembly Language (TAL)
Authoring Windows Applications In Assembly Language
Information on Linux assembly programming
x86 Instruction Set Reference
Terse: Algebraic Assembly Language for x86
Iczelion's Win32 Assembly Tutorial
IBM z/Architecture Principles of Operation IBM manuals on mainframe machine language and internals.
IBM High Level Assembler IBM manuals on mainframe assembler language.
Assembly Optimization Tips by Mark Larson
Mainframe Assembler Forum
NASM Manual
Experiment with Intel x86/x64 operating modes with assembly
Build yourself an assembler (eniAsm project) and various assembly articles and tutorials
Encoding Intel x86/IA-32 Assembler Instructions
The Basics of Assembly Language (Linux)

[edit] Software
MenuetOS - Operating System written entirely in 64-bit assembly language
SB-Assembler for most 8-bit processors/controllers
GNU lightning, a library that generates assembly language code at run-time, which is useful for just-in-time (JIT) compilers
WinAsm Studio, a free assembly IDE, with many open-source programs to download and a popular message board
The Netwide Assembler
GoAsm - a free assembler, part of the "Go" tools; supports 32-bit and 64-bit Windows programming