Draw a DFA for a lexical analyzer

Assignment Help Computer Engineering
Reference no: EM132706803

Assignment 1 - Compiler Design

1 Regular Expressions

1. Following the POSIX ERE syntax, give a regular expression that matches the date and temperature pairs reasonably similar to the following positive examples. Your expression must not match any of the negative examples.

Positive examples         Negative examples

01-12-2000 0K              1-9-2019 -4K

01-09-2019 40.2?C        12-09-2017 22.?C

28-03-2019 50?F           28-13-2019 50 ?F

01-12-2530 -2.33?C      01-12-2017 -5

Notice the following:
• There should be no whitespace before the temperature unit (which can be 'K', '°C' or '°F').
• A Kelvin temperature cannot be negative.
• There should be some whitespace(s) between the date and temperature.
• The day field must have a numerical value between 01 and 31 and the month field between 01 and 12. Years always have four digits.
• A decimal number should have digits before and after the decimal point. Decimal numbers can have at most two digits to the right of the decimal point.

2. Give short answers: Your regular expression may also match some strings that do not correspond to actual dates on a calendar. What additional checks are needed to filter out those?

Lexical Analysis

Function definitions in Erlang, which is a concurrent functional programming language, look like this:

decrease_to_zero(Argument1) ->
case Argument1 > 0 of
true ->
LocalVariable = Argument1 - 1, decrease_to_zero(LocalVariable);
false ->
<<"zero?", Argument1>>
end.

Most elements here are similar to those in most programming languages. More specifically this code snippet contains integers (0, and 1), strings ("zero?"), arithmetic operators (-), comparison operators (>), other operators (like = which in Erlang denotes pattern matching), keywords of the language (case, of, and end), and other miscellaneous elements of the language ((, ), ,, ->, <<, and >>)2. In addition, this code contains Erlang variables (Argument1 and LocalVariable) and atoms (decrease_to_zero, true, and false).
In Erlang, variables begin with a capital letter and atoms begin with a lowercase letter. Both categories can subsequently contain letters, digits, and underscores.

Erlang's own lexical analyzer returns the following list of tokens for this code. For tokens that only have one element in the corresponding category, we only show the token itself; tokens that are of general categories are shown as pairs where the first element shows the token category and the second element shows the lexeme.

[{id, decrease_to_zero}, '(', {var, Argument1}, ')', '->',
case, {var, Argument1}, '>', {integer, 0}, of,
{boolean, true}, '->',
{var, LocalVariable}, '=', {var, Argument1}, '-', {integer, 1}, ',',
{id, decrease_to_zero}, '(', {var, LocalVariable}, ')', ';',
{boolean, false}, '->',
'<<', {string, "zero?"}, ',', {var, Argument1}, '>>',
end, '.']

Tasks:
1. Write regular expressions that can match Erlang's:
(a) integers
(b) atoms
(c) variables
(d) strings; it should be possible for a string to contain quote characters; use '\' as an escape character4
2. Draw a DFA (or NFA)5 for a lexical analyzer that can recognize all the token categories present in the excerpt. Mark each accepting state with the label of the corresponding token. Your analyzer must not recognize more syntactic elements of the language (e.g., other comparison operators) than those necessary for scanning the code above but should be able to recognize different programs with the same elements.
3. Describe in detail (i.e. including all transitions from the initial state until a token or an error is emitted) how the following strings will be split into tokens by the lexical analyzer you defined in part 2.
(a) >>>=>>>->
(b) >>>->>-
4. What tokens (or errors) will be returned by your lexical analyzer for the following inputs?
(a) case factorial(Input) of 120 -> "Input was 5"; Other -> Other end
(b) OneOfTheseIsATrap = <<"be careful">> < <<65>>.
(c) OneOfTheseIsATrap = <<"be careful">> <<<65>>.
5. Give short answers:
(a) Lexical analyzers often treat consecutive whitespace characters as a single token, or ignore them altogether (emitting no tokens). Give examples where such behaviours are undesirable (one where a lexical analyzer should not ignore whitespace characters and one where it should additionally return separate tokens for each).
(b) Lexical analyzers normally ignore (do not emit tokens for) "comments". Give an example where such behaviour is undesirable.

Assignment 2

1 Parsing and Semantic Actions

Dice notation is a system to represent di?erent combi- nations of dice in role-playing games using simple algebra-like notation. The syntax of dice notation follows the following grammar, where all symbols in typewritter font (+, x, d, and int) are terminal symbols:
Rolls → Roll | Sum
Sum → Sum1 + Rolls | Roll
Roll → Dice | int | Dice x int
Dice → int1 d int2 | d int
A sample string of the language is 42 + 4d7x3 + d20
1. Is the grammar ambiguous? If it is, give an example. Otherwise, justify why it is not.
2. Give a leftmost derivation for the sample string.
3. Draw a parse tree for the sample string.

4. Write semantic actions to calculate the result of a dice roll expressed in dice notation. You may assume that in your actions you can also call the function random(n) which returns an integer between 1 and n (inclusive). You can execute arbitrary code in semantic actions (e.g., a for loop might come in handy), as long as you also assign some value to the left-hand side symbol. The final result should be stored as the value of the start symbol (i.e., Rolls.val).
5. Label/number your semantic actions. Describe an evaluation of the sample input, using the tree from question 3 and mentioning the order in which the actions are executed (use arbitrary random values for the results of calls to random).

2 LL(1) Grammars

In the following context-free grammar, which describes expressions in a simplified functional language, the symbols +, (, ), ->, and <id> are terminals and S is the initial symbol.
S → F + S | <id> -> S | F F → A V
A → F | ?
V → <id> | ( S )

1. Explain briefly why this grammar cannot be directly parsed by an LL(1) parser.

2. Convert this grammar to an equivalent one that has a better chance to be LL(1).

3. Write the First and Follow sets for all non-terminal symbols of the new grammar.

4. Construct the complete LL(1) parsing table for the new grammar.

5. Consider the tokens x and f to belong to the <id> token class. Show all the steps required to parse the input string: f(x -> x + f f x).

Assignment 3

1 LR(1) Grammars
Consider the following grammar, where {a,b,c,d,e} are terminal symbols and S is the initial symbol.
S → A a (1)
S → e (2)
A → d B (3)

A → b B S (4)

B → c B (5)
B → ? (6)

1. Write the First sets for all non-terminal symbols of the grammar.

2. Construct the full LR(1) DFA, showing all items in each state.

3. Construct the LR(1) parsing table using the DFA. For the reduce actions, use the provided enumeration of the productions in the grammar.

4. Show all steps required to parse the following string: bbcceaa

5. Is this grammar LALR(1)?

2 Activation Records

In the following listing we show an excerpt of a program in a language that allows nested definitions for functions and uses call-by-value:int collatz_length(int n, int len) {

boolean collatz_base() { return n == 34;
}

if (collatz_base()) return len + 13;
else
if ((n mod 2) == 1)
return collatz_length(3 * n + 1, len + 1); else
return collatz_length(n div 2, len + 1);
}

Suppose we have in our main() function a call to collatz length(7, 0). Assume that the compiler allocates all the variables (i.e. both named and temporary) on the stack.
Questions
1. What will the return value of this call be?
2. What is the maximum number of activation records that will ever be in the stack before the call from
main returns? (Assume that the compiler does not optimize tail calls.)

3. Give a diagram of the state of the stack just before collatz base returns true. Your diagram should contain at least the following information:
(a) Boundaries of the activation records.
(b) Location and content of the actual parameters for the calls to the functions.
(c) Where in the stack is the value of n, for which collatz base returns true.
(d) Location of the return values.
(e) Location of the return addresses.
(f) Location and content of the control links and access links.
(g) Optional: Location and content of temporary variables.
4. How should the value of n be retrieved, in order to make the comparison in collatz base?

5. Optional: Describe briefly the tail call (or tail recursion) optimization and whether it can be applied to this code snippet. You can read about it in the Wikipedia.

Attachment:- CSE_assignments.rar

Reference no: EM132706803

Questions Cloud

Make a budget showing quantity of solvent to purchased : Make a budget showing the quantity of solvent Q80 to be purchased for July, August, and September, and for the quarter in total.
Explain the quantitative analysis of options : Using your selected BA plan template, detail the solution you have selected and the criteria and evaluation instruments used. This document is an interim step.
Which of the statements regarding variable cost is true : Which of the statements regarding variable cost is true? Variable cost increases on a per unit basis as the number of units produced increases.
Exposing the various aspects of political process : What specific roles do both media forums that you chose have in exposing the various aspects of a political process?
Draw a DFA for a lexical analyzer : Lexical analyzer that can recognize all the token categories present in the excerpt. Mark each accepting state with the label of the corresponding token
Determine how much should operating income increase : Assume that the operating results for last year, The president expects sales to increase by 12% next year. By how much should operating income increase?
Assess the risk of material misstatement at assertion level : The bonds paid 7% interest every December 31, For each of the accounting issues, assess the risk of material misstatement at the assertion level
How does a high employee turnover rate impact : Employee turnover can be very costly to an organization in terms of lost productivity as well as added recruitment costs. While some turnover is to be expected.
Pros and cons on both sides of the debate about laws : Summarize recent developments in several states enacting voter ID laws. Explain the pros and cons on both sides of the debate about these laws.

Reviews

Write a Review

Computer Engineering Questions & Answers

  Mathematics in computing

Binary search tree, and postorder and preorder traversal Determine the shortest path in Graph

  Ict governance

ICT is defined as the term of Information and communication technologies, it is diverse set of technical tools and resources used by the government agencies to communicate and produce, circulate, store, and manage all information.

  Implementation of memory management

Assignment covers the following eight topics and explore the implementation of memory management, processes and threads.

  Realize business and organizational data storage

Realize business and organizational data storage and fast access times are much more important than they have ever been. Compare and contrast magnetic tapes, magnetic disks, optical discs

  What is the protocol overhead

What are the advantages of using a compiled language over an interpreted one? Under what circumstances would you select to use an interpreted language?

  Implementation of memory management

Paper describes about memory management. How memory is used in executing programs and its critical support for applications.

  Define open and closed loop control systems

Define open and closed loop cotrol systems.Explain difference between time varying and time invariant control system wth suitable example.

  Prepare a proposal to deploy windows server

Prepare a proposal to deploy Windows Server onto an existing network based on the provided scenario.

  Security policy document project

Analyze security requirements and develop a security policy

  Write a procedure that produces independent stack objects

Write a procedure (make-stack) that produces independent stack objects, using a message-passing style, e.g.

  Define a suitable functional unit

Define a suitable functional unit for a comparative study between two different types of paint.

  Calculate yield to maturity and bond prices

Calculate yield to maturity (YTM) and bond prices

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd