CS412 COMPILER CONSTRUCTION Lecture 1: Introduction ASST. PROF. SYED FAISAL ALI [email protected]
OBJECTIVES Students will understand the phases of the compilation process and be able to describe the purpose and implementation approach of each phase. Give students practical exposure to aspects of theoretical computer science including Languages, Grammars, and Machines. Exercise and reinforce prior programming knowledge with a non-trivial programming project to construct a compiler. Lexical analysis, parsing, and intermediate code-generation will be completed.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 2
WHY STUDY COMPILERS ? 3• Build a large, ambitious software system.• See theory come to life.• Learn how to build programming languages.• Learn how programming languages work.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
COURSE OUTLINE AT A GLANCEIntroduction • Foundation of Compiler Construction, Phases of CompilerSimple Syntax Directed Translator • Grammars, Ambiguity, Parse Tree, Associativity, Precedence,Trees Lexical Analysis • Role of Lexical Analyzer, Buffering, Errors,Tokens,Automata Syntax AnalysisIntermediate Code Generation • Role of Parsers, CFG’s,Writing Grammars, Eliminating Ambiguities,Type of Run Time Environment Parsers, Detail Parsing Techniques and Methods, (TDP, BUP) Code Generations Optimization • DAG’s,Three Address Code,Types and Declarations,Type Checking, Expressions, Control Flow, Backpatching, Switch Statements • Storage Organizations, Stack Allocations of Space, Heap Management, Garbage Collections,Trace-base Collections • Issues, Input in Code Generations, Target Programs,Addresses in Target Code, Basic Blocks and Flow Graphs, Simple Code Generators •Optimization Techniques, Principle Sources of Optimizations, Foundation of Data Flow Analysis, Loops in Flow Graphs, Region-based Analysis, Symbolic Analysis© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 4
TEXTBOOK 5 Compilers: Principles,Techniques, and Tools,Aho, Sethi and Ullman , 2nd Edition http://dragonbook.stanford.edu/ Andrew W.Appel, Modern compiler implementation in Java (Second edition), Cambridge University Press, New York, NY, USA, 2002, with Jens Palsberg.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
OTHER RECOMMENDED SOURCES 6 Parsing Techniques, Grune and Jacobs http://www.cs.vu.nl/~dick/PT2Ed.html Advanced Compiler Design and Implementation, Muchnik© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
HIGH-LEVEL LANGUAGESJohn Backus, Rear Admiral Graceteam lead on Hopper, inventor ofFORTRAN. A-0, COBOL, and the term “compiler.”© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 7
HISTORY OF COMPILERS 8 First complete compiler (1957) for FORTRAN – John Backus Early compilers – Assembly language Self Hosting compiler LISP (1962) PASCAL / C Bootstrapping© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
INTERPRETER Interpreters Used in educational / software development scenarios Easier to write than back ends Interpreters are normally written in a HLL Allows better error checking and reporting© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 9
INTERPRETER An interpreter reads the source code one instruction or line at a time, converts this line into machine code and executes it. The machine code is then discarded and the next line is read. Can interrupt; change the program and either continue or start again.Disadvantages Examples of interpreters are BASIC, script interpreters such as JavaScript, and languages such as Lisp.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 10
INTERPRETERS (1)Interpreter Output Source Program Encoding Input Data Machine-independent Significant overhead© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER 11CONSTRUCTION
COMPILERA program which translates a program written in one language to an equivalent program in other language.Source Compiler TargetProgram Program Error 12 Messages© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
COMPILER Intermediate Code (IR)Source Code Front end Back end Executable Code (Analysis) (Synthesis) Instead of producing an intermediate form the code is executed directly Interpreting backend© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 13
TYPES OF COMPILERS One pass Compiler Driver Quick ; PASCAL Syntactic Multi-pass Analyzer Step by step Compiler Contextual High performance; slower; lesser memory Driver Analyzer Source to Source Compiler Syntactic Contextual Code Code Analyzer Analyzer Generator Generator Multi-pass One pass© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 14
TYPES OF COMPILERS Stage Compiler Byte code compiler JIT Compiler Byte code is compiled to native machine code prior to execution. The JIT compiler compiles the bytecodes of that method into native machine code, compiling it \"just in time\" to run. When a method has been compiled, the JVM calls the compiled code of that method directly instead of interpreting it. Decompiler A decompiler is a computer program that performs the reverse operation to that of a compiler.C Front End Middle Back End ARMC++ End SpareAda x86_32Java Power PC© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 15
TYPES OF COMPILERS Trans compilers / Cascaders: sometimes to an intermediate language that still needs further processing; these are known as transcompilers (or sometimes as cascaders). Cross Compilers: A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running. For example, a compiler that runs on a Windows 7 PC but generates code that runs on Android smartphone is a cross compiler. Embedded Systems Narrow vs Broad Compilers : A narrow compiler reads a small part of the program, typically a few tokens, processes the information obtained, produces a few byte of object code if appropriate, discards the most of the information about these tokens, and repeats this process until the end of the program text is reached. Whereas, broad compiler reads the entire program and applies a series of transformations to it and reached the desired object code.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 16
PROPERTIES OF A GOOD COMPILER Reliability FaithfulnessGood Human Compilation Error Handling Interface Speed Implementation 17 & Maintenance© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
PROPERTIES OF GENERATED CODE Reliable Fast SecureSmall Faithful© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 18
PARTS OF THE COMPILATION Analysis SynthesisAnalysis: Breaks up the source program into constituent pieces and creates IR Tools: Structure Editors, Pretty Printers, Static Checker, InterpreterSynthesis: Construct the desired target program from IR.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 19
ANALYSIS OF THE SOURCE PROGRAM Linear Analysis: Lexical Analysis (or) Scanning It is a process of reading the characters from left to right and grouped into tokens having a collective meaning. Hierarchical Analysis: Syntax Analysis (or) Parsing It involves the grouping of tokens of the source program into grammatical phases that are used by the compiler to synthesize output. Semantic Analysis: Checks the source program for semantic errors and gathers type information for the subsequent code generation phase.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 20
PHASES OF THE COMPILER Lexical analysis (Scanning): Identify logical pieces of the description. Syntax analysis (Parsing): Identify how those pieces relate to each other. Semantic analysis: Identify the meaning of the overall structure. IR Generation: Design one possible structure. IR Optimization: Simplify the intended structure. Code Generation: Fabricate the structure. Code Optimization: Improve the resulting structure.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 21
PHASES OF COMPILER VISUALIZATION Source Language Lexical AnalysisSymbol Table Syntax Analysis Error Handler Manager Semantic Analysis Intermediate Code 22 Code Optimization Code Generation Target Language© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
THE STRUCTURE OF A MODERN COMPILERSource Code Lexical Analysis Front End Syntax Analysis Middle End Semantic Analysis Machine Back End IR Generation Code IR Optimization Code Generation Code Optimization© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 23
THE STRUCTURE OF A MODERN COMPILERSource Code Lexical Analysis Front End Syntax Analysis Back End Semantic Analysis Machine Code IR Generation IR Optimization Code Generation Code Optimization© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 24
OVERVIEW AND HISTORY (1) Cause Software for early computers was written in assembly language The benefits of reusing software on different CPUs started to become significantly greater than the cost of writing a compiler The first real compiler FORTRAN compilers of the late 1950s 18 person-years to build© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 25
OVERVIEW AND HISTORY (2) Compiler technology is more broadly applicable and has been employed in rather unexpected areas. Text-formatting languages, like nroff and troff; preprocessor packages like eqn, tbl, pic Silicon compiler for the creation of VLSI circuits Command languages of OS Query languages of Database systems© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 26
WHAT DO COMPILERS DO (1) A compiler acts as a translator, transforming human-oriented programming languages into computer-oriented machine languages. Ignore machine-dependent details for programmerProgramming Compiler MachineLanguage Language(Source) (Target)© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 27
WHAT DO COMPILERS DO (2) Compilers may generate three types of code: Pure Machine Code Machine instruction set without assuming the existence of any operating system or library. Mostly being OS or embedded applications. Augmented Machine Code Code with OS routines and runtime support routines. More often Virtual Machine Code Virtual instructions, can be run on any architecture with a virtual machine interpreter or a just-in- time compiler Ex. Java© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 28
WHAT DO COMPILERS DO (3) Another way that compilers differ from one another is in the format of the target machine code they generate: Assembly or other source format Re-locatable binary Relative address A linkage step is required Absolute binary Absolute address Can be executed directly© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 29
THE STRUCTURE OF A COMPILER (1) Any compiler must perform two major tasks CompilerAnalysis Synthesis Analysis of the source program 30 Synthesis of a machine-language program© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
THE STRUCTURE OF A COMPILER (2) Source Tokens Syntactic Semantic Program Structure Routines Scanner Parser(Character Stream) Intermediate Representation Symbol and Optimizer Attribute Tables Code 31 Generator (Used by all Phases of The Compiler)© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION Target machine code
THE STRUCTURE OF A COMPILER (3) Source Tokens Syntactic Semantic Program Structure Routines Scanner Parser(Character Stream)Scanner Intermediate Representation The scanner begins the analysis of the source program by Optimizercrehaadraincgtetrhseinintopuint,dicvhiSdayuArmaattlcrbtwiobeolur artbdenysdcahnadrascytmerb,oalsn(dtogkroeunpsin) g Code RE ( Regular expression ) Tables Generator 32 NFA ( Non-deterministic Finite Automata ) DFA ( Deterministic Finite A(Uutsomedatab)y all LEX Phases of The Compiler)© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILERCONSTRUCTION © ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION Target machine code
THE STRUCTURE OF A COMPILER (4) Source Tokens Syntactic Semantic Program Structure Routines Scanner Parser (Character Stream)Parser Intermediate Representation Given a formal syntax specification (typically as a context- Optimizer Abtfhreeseienmsgygnirunatstamoecmdtuic.naisrtst[rCuaFcstGusrp]See),yAcisTmtithftaireerbebidobcpleluoabastgryensndteihzeeredpa,rdtohsdeutocpktaieornsnsesraoenfidtthhgeerroCcuFaplGsls Code corresponding semantic routines directly or builds a syntax Generator 33 (Used by all Target machine code tree. Context-Free GrammPahra) ses of CFG ( BNF ( Backus-Naur Form )The Compiler) GC©AOAANSSS(TT.RGPURrCOaTFIm.OSYNmEDaFrAIASAnL aALlyI sSiFsAISAALlgALoIUrIiTth@mGMsAI)L.COM CS412-COMPILER LL, LR, SLR, LALR Parsers YACC © ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
THE STRUCTURE OF A COMPILER (5) Source Tokens Syntactic Semantic Program Structure Routines Scanner Parser (Character Stream)Semantic Routines Intermediate Representation Perform two functions Check the static semantics of each construct Optimizer Do the actual traSnysmlabtioolnand Code The heart of a compileArttribute Generator 34 Tables Syntax Directed Translation SIRem(Inatnetrimc PedroiacteesRsienpgreTseecnht(PnaUithqiosuanees)sdesbyofall The Compiler) © ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION © ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION Target machine code
THE STRUCTURE OF A COMPILER (6) Source Tokens Syntactic Semantic Program Structure Routines Scanner Parser (Character Stream)Optimizer Intermediate Representation The IR code generated by the semantic routines is Optimizer TPiamnhepaiesrlpyophzvheoealddesaeIoRnpcdtcaimotnrdaibzeneastfviooSenrrymAyTmtectarbdobiobmlienlupasttoelnedfxunacntdiosnlaolwly equivalent but Code Generator 35 loop optimization, reg(Uissteerdabllyocaalltion, code scheduling Register and Temporary MPanhaagsemesenot f Peephole Optimization The Compiler) © ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION © ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION Target machine code
THE STRUCTURE OF A COMPILER (7) Source Tokens Syntactic Semantic Program Structure Routines Scanner Parser(Character Stream) Intermediate RepresentationCode Generator Optimizer Interpretive Code Generation Generating Code from Tree/Dag Grammar-Based Code Generator© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CodeCONSTRUCTION Generator 36© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION Target machine code
THE STRUCTURE OF A COMPILER Scanner Code Generator [Lexical Analyzer] [Intermediate Code Generator]Tokens Non-optimized Intermediate Code Parser [Syntax Analyzer] Code Optimizer Optimized Intermediate CodeParse tree Semantic Process Code Optimizer [Semantic analyzer] Target machine codeAbstract Syntax Tree w/ Attributes 37 © ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
PHASES OF COMPILER WITH SUPPORTING PARTS© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 38
THE STRUCTURE OF A COMPILER (9) Compiler writing tools Compiler generators or compiler-compilers E.g. scanner and parser generators Examples : Yacc, Lex© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 39
HOW DOES YACC WORKS?© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER 40CONSTRUCTION
THE SYNTAX AND SEMANTICS OF PROGRAMMING LANGUAGE (1) A programming language must include the specification of syntax (structure) and semantics (meaning). Syntax typically means the context-free syntax because of the almost universal use of context-free-grammar (CFGs) Ex. a = b + c is syntactically legal b + c = a is illegal© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 41
THE SYNTAX AND SEMANTICS OF PROGRAMMING LANGUAGE (2) The semantics of a programming language are commonly divided into two classes: Static semantics Semantics rules that can be checked at compiled time. Ex. The type and number of a function’s arguments Runtime semantics Semantics rules that can be checked only at run time© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER 42CONSTRUCTION
COMPILER DESIGN AND PROGRAMMING LANGUAGE DESIGN (1) An interesting aspect is how programming language design and compiler design influence one another. Programming languages that are easy to compile have many advantages.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 43
COMPILER DESIGN AND PROGRAMMINGLANGUAGE DESIGN(2) Languages such as Snobol and APL (Address Programming Language) are usually considered noncompilable – SPITBOL (a compliable version of SNOBOL)What attributes must be found in a programming language to allowcompilation? Can the scope and binding of each identifier reference be determined before execution begins? Can the type of object be determined before execution begins? Can existing program text be changed or added to during execution?© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 44
COMPUTER ARCHITECTURE ANDCOMPILER DESIGN Compilers should exploit the hardware-specific feature and computing capability to optimize code. The problems encountered in modern computing platforms: Instruction sets for some popular architectures are highly non-uniform. High-level programming language operations are not always easy to support. Ex. exceptions, threads, dynamic heap access … Exploiting architectural features such as cache, distributed processors and memory Effective use of a large number of processors© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 45
COMPILER DESIGN CONSIDERATIONS Debugging Compilers Designed to aid in the development and debugging of programs. Optimizing Compilers Designed to produce efficient target code Re-targetable Compilers A compiler whose target architecture can be changed without its machine-independent components having to be rewritten.© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION 46
END OF LECTURE 1 47© ASST. PROF. SYED FAISAL ALI [email protected] CS412-COMPILER CONSTRUCTION
Search
Read the Text Version
- 1 - 47
Pages: