Lecture Four

Data Representation; Number Systems; Permissions

Data Representation

why study data representation?
- most computers today are digital
- most digital computers are binary
- thus, most computers understand only binary - combinations of ones and zeros
- any information stored in a computer, for example in memory or on disk or on CD-ROM, must be binary
- for many technical aspects of programming, the details of data representation must be understood:
  - C programming - sending information over networks correctly
  - Unix - setting access permissions for files and directories
  - web page programming - setting colours for backgrounds, text, and links
  - assembler or machine language programming
important numbering systems
- base 10 (decimal) is the numbering system used by most people
- base 2 (binary) is used by most computers internally
- base 8 (octal) and base 16 (hexadecimal) are used by programmers to more compactly represent binary numbers
in this lecture, we will see how numbers such as 237 and -84 are represented in computers

Number Systems

Base 10 - decimal numbering system

numbering system used by most people
in base 10, there are 10 numeric symbols, 0 to 9
when we write a number, such as 3572, the digits have different values depending on their position
- 2 simply represents the number 2, because it is at the right-most end of the number
- 7 represents the number 70
- 5 represents the number 500
- 3 represents the number 3000
- formally, because this is base 10, each digit is multiplied by the appropriate power of 10
```
  3 x 10³ = 3 x 1000 = 3000
  5 x 10² = 5 x  100 =  500
  7 x 10¹ = 7 x   10 =   70
  2 x 10⁰ = 2 x    1 =    2
```
- adding 3000 + 500 + 70 + 2 gives us 3572

Base 2 - binary numbering system

numbering system used by most computers
in base 2, there are 2 numeric symbols, 0 and 1
these symbols are called "bits", which is a contraction of binary digits
similar to the decimal system, in a binary number such as 10100100, the digits have different values depending on their position
- this makes it easy to convert binary numbers to decimal
- because this is base 2, each digit is multiplied by the appropriate power of 2:
```
  1 x 2⁷ = 1 x 128 = 128
  0 x 2⁶ = 0 x  64 =   0
  1 x 2⁵ = 1 x  32 =  32
  0 x 2⁴ = 0 x  16 =   0
  0 x 2³ = 0 x   8 =   0
  1 x 2² = 1 x   4 =   4
  0 x 2¹ = 0 x   2 =   0
  0 x 2⁰ = 0 x   1 =   0
```
- adding 128 + 0 + 32 + 0 + 0 + 4 + 0 + 0 gives us 164 in decimal
- thus 164₁₀ is equivalent to 10100100₂
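the positional arithmetic above can be sketched in a few lines of Python; this is only an illustration, and the function name positional_value is an assumption made for this example, not something defined in these notes

```python
def positional_value(digits: str, base: int) -> int:
    """Multiply each digit by the matching power of the base and add the results."""
    total = 0
    # reversed() makes the right-most digit position 0, the next one position 1, and so on
    for position, symbol in enumerate(reversed(digits)):
        digit = int(symbol, base)          # value of one digit symbol in the given base
        total += digit * base ** position  # digit x base^position
    return total


print(positional_value("3572", 10))     # 3572
print(positional_value("10100100", 2))  # 164
```

the same function works for any base covered in this lecture, because only the base changes, not the method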
to convert decimal numbers, for example 93, to binary:
- there are formal approaches, but the following simpler method works fine for all but very large numbers
- the least significant (right-most) digit in any numbering system has a positional value of 1
- moving towards the left, list the binary values by multiplying the value to the immediate right by 2
- for example, with 8 bits, this list of numbers is 128 64 32 16 8 4 2 1
- find the largest binary positional value that is equal to or smaller than 93
- this would be 64, so we know there is a "1" in the "64 position":
    128  64  32  16   8   4   2   1
          1
- subtract 64 from 93, giving 29
- find the largest binary positional value that is equal to or smaller than 29
- this would be 16, so we know there is a "1" in the "16 position":
    128  64  32  16   8   4   2   1
          1       1
- subtract 16 from 29, giving 13
- find the largest binary positional value that is equal to or smaller than 13
- this would be 8, so we know there is a "1" in the "8 position":
    128  64  32  16   8   4   2   1
          1       1   1
- subtract 8 from 13, giving 5
- find the largest binary positional value that is equal to or smaller than 5
- this would be 4, so we know there is a "1" in the "4 position":
    128  64  32  16   8   4   2   1
          1       1   1   1
- subtract 4 from 5, giving 1
- it might be obvious by now that there is a "1" in the "1 position", and we have finished this part of the process:
    128  64  32  16   8   4   2   1
          1       1   1   1       1
- now place a "0" in all the remaining positions:
    128  64  32  16   8   4   2   1
      0   1   0   1   1   1   0   1
- thus 01011101₂ is the 8-bit binary equivalent to 93₁₀
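the subtraction method above might be sketched in Python as follows; the function name decimal_to_binary and the default width of 8 bits are assumptions for this example, and the loop walks the positional values from left to right, which produces the same "1" positions as repeatedly finding the largest value that fits

```python
def decimal_to_binary(number: int, bits: int = 8) -> str:
    """Convert a non-negative integer to a fixed-width binary string by subtraction."""
    if not 0 <= number < 2 ** bits:
        raise ValueError("number does not fit in the requested number of bits")
    result = ""
    # positional values from left to right: 128 64 32 16 8 4 2 1 for 8 bits
    for position in range(bits - 1, -1, -1):
        value = 2 ** position
        if value <= number:
            result += "1"    # this positional value fits, so the bit is 1
            number -= value  # subtract it and carry on with the remainder
        else:
            result += "0"    # too large, so the bit is 0
    return result


print(decimal_to_binary(93))  # 01011101
```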

Base 8 - octal numbering system

numbering system often used by programmers to more easily represent binary numbers
in base 8, there are 8 numeric symbols, 0 to 7
similar to the decimal system, in an octal number such as 264, the digits have different values depending on their position
- this makes it easy to convert octal numbers to decimal
- because this is base 8, each digit is multiplied by the appropriate power of 8:
```
  2 x 8² = 2 x 64 = 128
  6 x 8¹ = 6 x  8 =  48
  4 x 8⁰ = 4 x  1 =   4
```
- adding 128 + 48 + 4 gives us 180 in decimal
- thus 180₁₀ is equivalent to 264₈
the real power of octal is its use in representing binary numbers

this is because one octal digit can exactly represent 3 bits:

  0 0 0  in binary is  0  in octal
  0 0 1                1
  0 1 0                2
  0 1 1                3
  1 0 0                4
  1 0 1                5
  1 1 0                6
  1 1 1                7

to convert a binary number, such as 11101010, to octal:
- split up the binary number into groups of 3 bits, starting from the right side:
    1 1 / 1 0 1 / 0 1 0
- if there are fewer than 3 bits in the left-most grouping, you may ignore this or add leading zeroes:
    0 1 1 / 1 0 1 / 0 1 0
- leading zeroes don't change the value of a number, in any numbering system
- now simply convert each 3-bit grouping to the corresponding octal digit:
    0 1 1 / 1 0 1 / 0 1 0
      3   /   5   /   2
- thus 352₈ is equivalent to 11101010₂
to convert an octal number, such as 760, to binary:
- reverse the above process, by replacing each octal digit with its 3-bit equivalent:
      7   /   6   /   0
    1 1 1 / 1 1 0 / 0 0 0
- thus 111110000₂ is equivalent to 760₈
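the 3-bit grouping in both directions can be sketched in Python like this; binary_to_octal and octal_to_binary are hypothetical helper names chosen for this example

```python
def binary_to_octal(bits: str) -> str:
    """Group bits in threes from the right and convert each group to one octal digit."""
    padding = (-len(bits)) % 3  # leading zeroes needed to fill the left-most group
    bits = "0" * padding + bits
    groups = [bits[i:i + 3] for i in range(0, len(bits), 3)]
    return "".join(str(int(group, 2)) for group in groups)


def octal_to_binary(octal: str) -> str:
    """Replace each octal digit with its 3-bit equivalent."""
    return "".join(format(int(digit, 8), "03b") for digit in octal)


print(binary_to_octal("11101010"))  # 352
print(octal_to_binary("760"))       # 111110000
```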
example of the use of octal numbers:
- to set permission on a file, a command similar to "chmod 751 filename" might be used
- the digits for the chmod command are octal; we will look at the details next week

Base 16 - hexadecimal numbering system (hex)

numbering system often used by programmers to more easily represent binary numbers
in base 16, there are 16 numeric symbols, 0 1 2 3 4 5 6 7 8 9 A B C D E F
similar to the decimal system, in a hex number such as A1C, the digits have different values depending on their position
- this makes it easy to convert hex numbers to decimal
- because this is base 16, each digit is multiplied by the appropriate power of 16:
```
  A x 16² = 10 x 256 = 2560
  1 x 16¹ =  1 x  16 =   16
  C x 16⁰ = 12 x   1 =   12
```
- adding 2560 + 16 + 12 gives us 2588 in decimal
- thus 2588₁₀ is equivalent to A1C₁₆
the real power of hexadecimal is its use in representing binary numbers

this is because one hexadecimal digit can exactly represent 4 bits:

  0 0 0 0  in binary is  0  in hexadecimal
  0 0 0 1                1
  0 0 1 0                2
  0 0 1 1                3
  0 1 0 0                4
  0 1 0 1                5
  0 1 1 0                6
  0 1 1 1                7
  1 0 0 0                8
  1 0 0 1                9
  1 0 1 0  in binary is  A  in hexadecimal and  10  in decimal
  1 0 1 1                B                      11
  1 1 0 0                C                      12
  1 1 0 1                D                      13
  1 1 1 0                E                      14
  1 1 1 1                F                      15

to convert a binary number, such as 111110000, to hexadecimal:
- split up the binary number into groups of 4 bits, starting from the right side:
    1 / 1 1 1 1 / 0 0 0 0
- if there are fewer than 4 bits in the left-most grouping, you may ignore this or add leading zeroes:
    0 0 0 1 / 1 1 1 1 / 0 0 0 0
- leading zeroes don't change the value of a number, in any numbering system
- now simply convert each 4-bit grouping to the corresponding hex digit:
    0 0 0 1 / 1 1 1 1 / 0 0 0 0
       1    /    F    /    0
- thus 1F0₁₆ is equivalent to 111110000₂
to convert a hex number, such as 3C9, to binary:
- reverse the above process, by replacing each hex digit with its 4-bit equivalent:
       3    /    C    /    9
    0 0 1 1 / 1 1 0 0 / 1 0 0 1
- thus 1111001001₂ is equivalent to 3C9₁₆
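the 4-bit grouping for hex can be sketched the same way in Python; binary_to_hex and hex_to_binary are hypothetical helper names chosen for this example, and hex_to_binary keeps the leading zeroes of the left-most group

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Group bits in fours from the right and convert each group to one hex digit."""
    padding = (-len(bits)) % 4  # leading zeroes needed to fill the left-most group
    bits = "0" * padding + bits
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(hex_digits: str) -> str:
    """Replace each hex digit with its 4-bit equivalent."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)


print(binary_to_hex("111110000"))  # 1F0
print(hex_to_binary("3C9"))        # 001111001001
```

dropping the two leading zeroes from 001111001001 gives 1111001001, the value shown in the worked example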
example of the use of hex numbers:
- this html tag sets the colour for background on a web page
- the digits are hex, in pairs, representing red, green, and blue intensities
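as a rough illustration of how such a colour value breaks down, the Python sketch below splits a six-digit hex colour into three numbers in the usual web order of red, green, blue; the colour "#FF8000" and the function name split_colour are made up for this example and are not taken from the tag in these notes

```python
def split_colour(colour: str) -> tuple:
    """Split a colour such as '#FF8000' into red, green, and blue values (0 to 255)."""
    digits = colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected exactly six hex digits")
    # each pair of hex digits is one byte: two hex digits cover 8 bits
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


print(split_colour("#FF8000"))  # (255, 128, 0)
```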

Representing Signed Numbers in Binary

computers use a technique called "two's complement" to handle signed numbers, that is numbers that can be positive or negative
to represent a negative number, such as -93:
- find the binary number for the positive number 93 (we did this above):
      1 0 1 1 1 0 1
- add at least one leading zero:
    0 1 0 1 1 1 0 1
- invert all the bits, by turning 1's into 0's and 0's into 1's:
    1 0 1 0 0 0 1 0
- now add 1 to the result:
    1 0 1 0 0 0 1 0 + 1 = 1 0 1 0 0 0 1 1
- thus 10100011₂ is the 8-bit signed binary number equivalent to -93₁₀
- note that this is also the 8-bit unsigned binary number equivalent to 163₁₀
- you MUST know the representation being used in order to be able to convert properly
to convert a negative binary number to its positive value, the technique is identical - invert the bits and add one
- to find the number that 10100011 represents as a signed number:
    1 0 1 0 0 0 1 1
- invert all the bits, by turning 1's into 0's and 0's into 1's:
    0 1 0 1 1 1 0 0
- now add 1 to the result:
    0 1 0 1 1 1 0 0 + 1 = 0 1 0 1 1 1 0 1
- thus, we get back to 01011101, which is 93, so 10100011 represents -93 as a signed number
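the invert-and-add-one steps can be sketched in Python as follows; twos_complement and signed_value are hypothetical helper names, and the 8-bit default width is an assumption matching the examples above

```python
def invert(pattern: str) -> str:
    """Turn 1's into 0's and 0's into 1's."""
    return "".join("1" if bit == "0" else "0" for bit in pattern)


def twos_complement(value: int, bits: int = 8) -> str:
    """Return the signed bit pattern for value, using invert-and-add-one for negatives."""
    if not -(2 ** (bits - 1)) <= value < 2 ** (bits - 1):
        raise ValueError("value does not fit in the requested number of bits")
    if value >= 0:
        return format(value, f"0{bits}b")
    positive = format(-value, f"0{bits}b")  # binary for the positive number, with leading zeroes
    return format(int(invert(positive), 2) + 1, f"0{bits}b")


def signed_value(pattern: str) -> int:
    """Interpret a bit pattern as a signed number."""
    if pattern[0] == "0":
        return int(pattern, 2)  # leading 0 means the number is not negative
    return -(int(invert(pattern), 2) + 1)  # invert, add one, then attach the minus sign


print(twos_complement(-93))      # 10100011
print(signed_value("10100011"))  # -93
print(int("10100011", 2))        # 163, the same bits read as unsigned
```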

Permissions

permissions can only be changed by file owner or superuser (system administrator)
chmod is used to alter access permissions to an existing file or directory
the 9 permission bits displayed in an ls -al listing are read/write/execute for user(owner)/group/other
chmod xxx filename
- xxx is 3 octal digits representing the binary string rwxrwxrwx where the first three characters are read/write/execute permission for the user, the next three for the user's group, and the last three for all others
- eg. chmod 640 file1 - would give the user read and write permission, everyone in his group would have read permission, and all others would have no permission
- this is called the "octal" or "absolute" method of changing permissions
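to see how the three octal digits map onto the rwxrwxrwx string, here is a small Python sketch; mode_to_string is a hypothetical helper for illustration only and does not call chmod or touch any files

```python
def mode_to_string(mode: str) -> str:
    """Expand three octal digits into the 9-character rwxrwxrwx form."""
    result = ""
    for digit in mode:  # one digit each for user, group, other
        bits = format(int(digit, 8), "03b")  # each octal digit is exactly 3 bits
        for letter, bit in zip("rwx", bits):
            result += letter if bit == "1" else "-"
    return result


print(mode_to_string("640"))  # rw-r-----
print(mode_to_string("751"))  # rwxr-x--x
```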
chmod u+r filename
- u represents user, could also be g for group, o for other, a for all
- + represents addition of permission, could also be - for removal, = for set
- r represents read permission, could also be w for write, x for execute
- eg. chmod u+x file1 - would give the user execute permission in addition to whatever he had before
- eg. chmod g-w file1 - would take away write permission from the user's group if they had it before
- eg. chmod o=r file1 - would set all others' permission to read only regardless of what they had before
- this is called the "symbolic" or "relative" method of changing permissions
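the effect of the symbolic method on a permission string can be sketched in Python as below; apply_symbolic is a hypothetical helper that only handles the simple one-class, one-operator forms shown above (such as u+x, g-w, o=r, a+r), which is far less than the real chmod accepts

```python
def apply_symbolic(permissions: str, change: str) -> str:
    """Apply a change such as 'u+x' to a 9-character string such as 'rw-r--r--'."""
    who, operator, letters = change[0], change[1], change[2:]
    classes = "ugo" if who == "a" else who  # a means user, group, and other
    perms = list(permissions)
    for cls in classes:
        start = "ugo".index(cls) * 3  # user at 0, group at 3, other at 6
        for offset, letter in enumerate("rwx"):
            if operator == "+" and letter in letters:
                perms[start + offset] = letter  # add the permission
            elif operator == "-" and letter in letters:
                perms[start + offset] = "-"     # remove the permission
            elif operator == "=":
                # set exactly the listed permissions, clearing the others
                perms[start + offset] = letter if letter in letters else "-"
    return "".join(perms)


print(apply_symbolic("rw-r--r--", "u+x"))  # rwxr--r--
print(apply_symbolic("rw-rw-r--", "g-w"))  # rw-r--r--
print(apply_symbolic("rw-r--rwx", "o=r"))  # rw-r--r--
```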

Directory Permissions

r permission for a directory allows viewing of file names in the directory, but no access to the files themselves (regardless of the files' permission settings)
x gives passthrough permission for a directory, which allows access to any files in the directory which have appropriate permissions set, but doesn't allow viewing of file names in the directory
r and x permissions allow viewing of file names, and access to any files which have appropriate permissions set
w and x permissions allow adding or removing of files, but don't allow viewing of file names
r and w and x permissions allow viewing of file names, access to any files which have appropriate permissions set, and adding and removing of files

umask

umask defines default permissions for newly created files and directories; it doesn't change permissions on existing files
- for directories, default permissions will be 777 minus the umask, digit by digit; for files, also remove any remaining execute permissions
- eg. umask - by itself, shows current umask setting
- eg. umask 077 - new directories will be 700, new files will be 600
- eg. umask 023 - new directories will be 754, new files will be 644
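the umask arithmetic can be checked with a short Python sketch; default_modes is a hypothetical helper, and it assumes the common case where programs request 777 for new directories and 666 for new files, with the umask bits then cleared, which gives the same results as the rule above

```python
def default_modes(umask: str) -> tuple:
    """Return (directory mode, file mode) as octal strings for a given umask."""
    mask = int(umask, 8)
    directory_mode = 0o777 & ~mask  # clear every bit that is set in the umask
    file_mode = 0o666 & ~mask       # files start without execute permission
    return format(directory_mode, "03o"), format(file_mode, "03o")


print(default_modes("077"))  # ('700', '600')
print(default_modes("023"))  # ('754', '644')
```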