
ch6-1. Data Type

emzei 2012. 3. 17. 14:15

▷ Intro

★ Goal : How well do the data types provided by a programming language match the real-world problem domain?


★ Primitive data type vs. User-defined data type

★ Concept and Implementation


♠ Data type : a collection of data values and a set of predefined operations on those values.


※ Descriptor : the collection of the attributes of a variable.

- In an implementation, a descriptor is an area of memory that stores the attributes of a variable.

> If the attributes are all static, descriptors are required only at compile time. These descriptors are built by the compiler, usually as part of the symbol table, and are used during compilation.

> For dynamic attributes, part or all of the descriptor must be maintained during execution. In this case, the descriptor is used by the run-time system.

- In all cases, the descriptors are used for type checking and to build the code for the allocation and deallocation operations.



▷ Primitive data types : not defined in terms of other types

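♠ numeric types (integer / floating-point)

▣ Integer

  - consists of a sign bit and a magnitude (absolute value)

  - negative numbers : two's complement, or sign-magnitude representation

  - the size depends on the PL implementation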
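
The two negative-number representations above can be compared on 8 bits. This is a minimal sketch only; the function names are hypothetical, and the 8-bit width is an assumption chosen to keep the bit patterns short.

```c
#include <stdint.h>

/* Encode v (-127..127) as 8-bit sign-magnitude:
   bit 7 is the sign bit, bits 0-6 hold the absolute value. */
uint8_t to_sign_magnitude(int v)
{
    if (v < 0)
        return (uint8_t)(0x80u | (unsigned)(-v));
    return (uint8_t)v;
}

/* Encode v (-128..127) as 8-bit two's complement.
   Conversion to an unsigned 8-bit type is defined modulo 256,
   which yields exactly the two's complement bit pattern. */
uint8_t to_twos_complement(int v)
{
    return (uint8_t)v;
}
```

For -5, the first function gives the pattern 0x85 (sign bit plus magnitude 5) and the second gives 0xFB (256 - 5).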


▣ Floating-Point

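  - a model of real numbers

  - only an approximation for most real values (limited precision)

    e.g., 0.1 cannot be represented exactly in binary

  - float ; sign : 1 bit / exp : 8 bits / fraction : 23 bits

  - double ; sign : 1 bit / exp : 11 bits / fraction : 52 bits

  - the implementation can be influenced by the hardware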
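
A small sketch of the float layout and the 0.1 problem, assuming that `float` is a 32-bit IEEE 754 value on the target machine (true on most current hardware, but not guaranteed by C). The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of a single-precision value. */
struct float_fields {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 bits */
};

/* Split a float into its sign, exponent and fraction fields. */
struct float_fields split_float(float f)
{
    uint32_t bits;
    struct float_fields out;
    memcpy(&bits, &f, sizeof bits);   /* copy the raw bytes of the float */
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.fraction = bits & 0x7FFFFFu;
    return out;
}

/* Returns 1 if 0.1 + 0.2 compares equal to 0.3 in double arithmetic,
   0 otherwise. volatile keeps the compiler from folding the sum. */
int tenth_sum_is_exact(void)
{
    volatile double a = 0.1, b = 0.2;
    return (a + b) == 0.3;
}
```

Because 0.1 has no finite binary expansion, its fraction field holds a rounded repeating pattern, and sums such as 0.1 + 0.2 generally do not compare equal to 0.3.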


▣ Decimal

  - requires hardware support for decimal data

  - a decimal type stores a fixed number of decimal digits, with the decimal point at a fixed position in the value

  ◈ Binary Coded Decimal (BCD)

  - good : can store decimal value exactly

  - bad : memory requirement
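
The trade-off can be seen in a sketch of packed BCD, where each decimal digit takes one 4-bit nibble. The helper names are hypothetical, and packed BCD (two digits per byte) is only one of several BCD variants.

```c
#include <stdint.h>

/* Pack a value 0..99 into one byte of packed BCD:
   the tens digit goes in the high nibble, the ones digit in the low nibble. */
uint8_t to_packed_bcd(unsigned value)
{
    return (uint8_t)(((value / 10u) << 4) | (value % 10u));
}

/* Unpack one packed-BCD byte back into 0..99. */
unsigned from_packed_bcd(uint8_t bcd)
{
    return (unsigned)(bcd >> 4) * 10u + (bcd & 0x0Fu);
}
```

Every decimal digit is stored exactly, but one byte covers only 0..99 here, while the same byte used as a binary integer covers 0..255.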


♠ boolean(logical) types

  - simplest types : false (zero) / true (non-zero)

  - ALGOL60, Pascal : primitive data type

  - C, C++ : numeric expressions can serve as conditions (zero is false, non-zero is true)

  - Memory representation : could be a single bit, but usually a byte, the smallest efficiently addressable cell of memory


♠ character types

  - ASCII code

  - codes 0~127 : 128 different characters


▷ Character string types : sequence of characters

♠ design issue

- Should strings be simply a kind of character array, or a primitive data type?

- Should strings have static or dynamic length?


♠ String operation

  - substring reference

  - catenation

  - relational operator : compare 

  - assignment

  - length

  - pattern matching


---< ◈ EXAMPLE >---


 Pascal, C, C++, Ada : string ~ stored in array of single characters (not primitive data type)


 Ada 

  - STRING : a type predefined to be single-dimensioned arrays of CHARACTER elements

  - substring reference : NAME1(2:4)

  - catenation : &  ex. NAME1:=NAME1&NAME2;


 C, C++

  - char arrays to store character strings

  - standard library : string.h

  - character strings are terminated with a special character, null

> the length of the string variable need not be stored; it is found by scanning for the null character

  - string handling functions : strcpy, strcat, strcmp, strlen...
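
A short sketch of the string.h functions listed above working on null-terminated char arrays. `join_names` is a hypothetical helper; only `strlen`, `strcpy` and `strcat` come from the standard library.

```c
#include <string.h>

/* Build "first last" in out, which has room for cap bytes including
   the terminating null. Returns the resulting length, or -1 if the
   result would not fit. */
int join_names(char *out, size_t cap, const char *first, const char *last)
{
    /* two lengths, one space, one terminating null */
    if (strlen(first) + 1 + strlen(last) + 1 > cap)
        return -1;
    strcpy(out, first);        /* assignment */
    strcat(out, " ");          /* catenation */
    strcat(out, last);
    return (int)strlen(out);   /* length: counts characters up to the null */
}
```

The explicit capacity check is needed because strcpy and strcat do no bounds checking of their own.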


 FORTRAN77, FORTRAN90, BASIC

  - string as primitive data type

  - assignment, relational operator, catenation, substring


 Java 

  - String class, StringBuffer class


 SNOBOL4

  - for pattern matching


♠ String Length Option

♤ Static length string

  - static length is specified in declaration part

  - FORTRAN77/90, COBOL, Pascal, Ada

  - e.g., FORTRAN90 ~ CHARACTER(LEN=15) NAME1, NAME2


♤ Limited dynamic length strings : convenient for the user

  - the length is varying up to a declared and fixed maximum set by the variable definition

  - e.g., C,C++


♤ Dynamic length string

  - varying length with no maximum

  - e.g., SNOBOL4, JavaScript, Perl


----- < ◈ implementation ◈ > -----

◈ Static length string

  - need compile time description

  - length, address

◈ Limited dynamic string

  - need run-time description

  - maximum length, current length, address

◈ Dynamic length string

  - need run-time description

  - current length

  - implementation methods : linked list, adjacent storage cells
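
The run-time descriptor of a limited dynamic string (maximum length, current length, address) can be sketched as a C struct. The type and function names are hypothetical, and a real run-time system would keep this descriptor internally rather than expose it.

```c
#include <stddef.h>
#include <string.h>

/* Run-time descriptor for a limited dynamic length string. */
struct lim_string {
    size_t max_len;   /* fixed maximum, set by the variable definition */
    size_t cur_len;   /* current length, changes during execution */
    char  *addr;      /* address of the character storage */
};

/* Assign src to s. Returns 0 on success, or -1 if src is longer
   than the declared maximum (the string is left unchanged). */
int lim_assign(struct lim_string *s, const char *src)
{
    size_t n = strlen(src);
    if (n > s->max_len)
        return -1;            /* length check against the descriptor */
    memcpy(s->addr, src, n);
    s->cur_len = n;
    return 0;
}
```

A static length string would need only the length and address, and only at compile time, so no such struct would exist at run time.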


▷ User-defined ordinal types

 ♠ ordinal type : the range of possible values can be easily associated with the set of positive integers


♠ Enumeration type

  - All of the possible values are enumerated in the definition

  - example

C : typedef enum {mon, tue, wed, ..., sun} DAYS;

Ada : type DAYS is (mon, tue, wed, ..., sun);


  - design issue

> Is a literal constant allowed to appear in more than one type definition?

(overloaded literals)

> Are enumeration values coerced to integer?

(C, C++ : yes; Ada : no)


  - Operators

> predecessor, successor (pred, succ)

> position in the list of values (the ordinal value)

> value for a given position number

> comparison (relational operators)


----- < ◈ Operators example ◈ > -----

◈ Pascal

  - a literal constant is not allowed to be used in more than one enumeration type

  - can be used as

> array subscripts, for-loop variables, case selectors...

◈ C, C++

  - the same literal cannot appear in more than one enumeration type in the same scope

  - enumeration objects are converted to integer
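
Since C converts enumeration values to integers, the position, successor and predecessor operators can be sketched with plain arithmetic. The literals thu, fri and sat are assumed here to fill out the DAYS list, the function names are hypothetical, and the wrap-around at the ends is a design choice of this sketch (Pascal's succ and pred do not wrap).

```c
/* DAYS with the full list of literals spelled out (assumed). */
typedef enum { mon, tue, wed, thu, fri, sat, sun } DAYS;

/* Position of a value in the list (its ordinal value).
   In C this is just the implicit conversion to int. */
int day_pos(DAYS d)
{
    return (int)d;
}

/* Successor, wrapping from sun back to mon. */
DAYS day_succ(DAYS d)
{
    return (DAYS)((d + 1) % 7);
}

/* Predecessor, wrapping from mon back to sun. */
DAYS day_pred(DAYS d)
{
    return (DAYS)((d + 6) % 7);
}
```

The same implicit conversion is also the source of the type checking problem: nothing stops a C program from assigning an arbitrary integer to a DAYS variable.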

◈ Ada

  - similar to those of Pascal, except that literals are allowed to appear in more than one declaration (overloaded literals)


  - Evaluation of Enumeration Type

◐ Advantage : readability, reliability

◐ Problem

  - memory representation

  - type checking problem

◐ Time/Space trade-off

◐ Java did not include enumeration types until Java 5.0


♠ Subrange types

  - a contiguous subsequence of an ordinal type 

> no design issue

> e.g., 12 .. 14

 

----- < ◈ example ◈ > -----

◈ Pascal ~ standard

  type

    uppercase = 'A' .. 'Z';

    index = 1 .. 100;

 ◈ Ada

   subtype WEEKDAYS is DAYS range Mon .. Fri;


  - Operations

♤ All of the operations defined for the parent type are also defined for the subtype except assignment of values outside the specified range


♤ Evaluation : enhances readability


♤ Implementation

  - enumeration : values are associated with non-negative integers

  - subrange : exactly the same way as parent type except range check
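
C has no subrange types, so the range check that a Pascal or Ada implementation generates can only be imitated by hand. The sketch below assumes a subrange of 1 .. 100 stored as an ordinary int (the parent type); the macro and function names are hypothetical.

```c
/* Bounds of the assumed subrange 1 .. 100. */
#define INDEX_LO 1
#define INDEX_HI 100

/* Assign value to *target only if it lies inside the subrange.
   Returns 1 on success, 0 on a range violation (target unchanged). */
int index_assign(int *target, int value)
{
    if (value < INDEX_LO || value > INDEX_HI)
        return 0;   /* a Pascal or Ada run-time system would signal an error here */
    *target = value;
    return 1;
}
```

The storage and all other operations are those of int; only the assignment carries the extra comparison.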