Lecture Four

Data Representation; Number Systems; Permissions

Data Representation

  • why study data representation?
    • most computers today are digital
    • most digital computers are binary
    • thus, most computers understand only binary - combinations of ones and zeros
    • any information stored in a computer, for example in memory or on disk or on CD-ROM, must be binary
    • for many technical aspects of programming, the details of data representation must be understood:
      • C programming - sending information over networks correctly
      • Unix - setting access permissions for files and directories
      • web page programming - setting colours for backgrounds, text, and links
      • assembler or machine language programming
  • important numbering systems
    • base 10 (decimal) is the numbering system used by most people
    • base 2 (binary) is used by most computers internally
    • base 8 (octal) and base 16 (hexadecimal) are used by programmers to more compactly represent binary numbers
  • in this lecture, we will see how numbers such as 237 and -84 are represented in computers

Number Systems

Base 10 - decimal numbering system

  • numbering system used by most people
  • in base 10, there are 10 numeric symbols, 0 to 9
  • when we write a number, such as 3572, the digits have different values depending on their position

    • 2 simply represents the number 2, because it is at the right-most end of the number
    • 7 represents the number 70
    • 5 represents the number 500
    • 3 represents the number 3000
    • formally, because this is base 10, each digit is multiplied by the appropriate power of 10

        3 x 10^3 = 3 x 1000 = 3000
        5 x 10^2 = 5 x  100 =  500
        7 x 10^1 = 7 x   10 =   70
        2 x 10^0 = 2 x    1 =    2
      
    • adding 3000 + 500 + 70 + 2 gives us 3572
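
The place-value breakdown above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes; the function name place_values is hypothetical.

```python
def place_values(number, base=10):
    """Split a non-negative integer into the value contributed by each digit."""
    digits = []
    while True:
        number, digit = divmod(number, base)   # peel off the right-most digit
        digits.append(digit)
        if number == 0:
            break
    digits.reverse()                           # most significant digit first
    top = len(digits) - 1
    # multiply each digit by the power of the base that matches its position
    return [d * base ** (top - i) for i, d in enumerate(digits)]


parts = place_values(3572)
print(parts)        # [3000, 500, 70, 2]
print(sum(parts))   # 3572
```

The base parameter defaults to 10, but the same positional idea applies to the other numbering systems in this lecture.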

Base 2 - binary numbering system

  • numbering system used by most computers
  • in base 2, there are 2 numeric symbols, 0 and 1
  • these symbols are called "bits", which is a contraction of binary digits
  • similar to the decimal system, in a binary number such as 10100100, the digits have different values depending on their position

    • this makes it easy to convert binary numbers to decimal
    • because this is base 2, each digit is multiplied by the appropriate power of 2:

        1 x 2^7 = 1 x 128 = 128
        0 x 2^6 = 0 x  64 =   0
        1 x 2^5 = 1 x  32 =  32
        0 x 2^4 = 0 x  16 =   0
        0 x 2^3 = 0 x   8 =   0
        1 x 2^2 = 1 x   4 =   4
        0 x 2^1 = 0 x   2 =   0
        0 x 2^0 = 0 x   1 =   0
      
    • adding 128 + 0 + 32 + 0 + 0 + 4 + 0 + 0 gives us 164 in decimal

    • thus 164 (base 10) is equivalent to 10100100 (base 2)
  • to convert decimal numbers, for example 93, to binary:
    • there are formal approaches, but the following simpler method works fine for all but very large numbers
    • the least significant (right-most) digit in any numbering system has a positional value of 1
    • moving towards the left, list the binary values by multiplying the value to the immediate right by 2
    • for example, with 8 bits, this list of numbers is 128 64 32 16 8 4 2 1
    • find the largest binary positional value that is equal to or smaller than 93
    • this would be 64, so we know there is a "1" in the "64 position"

        128  64  32  16   8   4   2   1
              1

    • subtract 64 from 93, giving 29
    • find the largest binary positional value that is equal to or smaller than 29
    • this would be 16, so we know there is a "1" in the "16 position"

        128  64  32  16   8   4   2   1
              1       1

    • subtract 16 from 29, giving 13
    • find the largest binary positional value that is equal to or smaller than 13
    • this would be 8, so we know there is a "1" in the "8 position"

        128  64  32  16   8   4   2   1
              1       1   1

    • subtract 8 from 13, giving 5
    • find the largest binary positional value that is equal to or smaller than 5
    • this would be 4, so we know there is a "1" in the "4 position"

        128  64  32  16   8   4   2   1
              1       1   1   1

    • subtract 4 from 5, giving 1
    • it might be obvious by now that there is a "1" in the "1 position", and we have finished this part of the process

        128  64  32  16   8   4   2   1
              1       1   1   1       1

    • now place a "0" in all the remaining positions

        128  64  32  16   8   4   2   1
          0   1   0   1   1   1   0   1

    • thus 01011101 (base 2) is the 8-bit binary equivalent of 93 (base 10)
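
Both conversions described above can be sketched in Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the fixed width of 8 bits is chosen only to match the worked example.

```python
def binary_to_decimal(bits):
    """Sum the powers of 2 for every position that holds a 1."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


def decimal_to_binary(number, width=8):
    """Subtract positional values from largest to smallest, as in the steps above."""
    if not 0 <= number < 2 ** width:
        raise ValueError("number does not fit in the requested width")
    bits = ""
    for position in range(width - 1, -1, -1):
        value = 2 ** position            # 128, 64, 32, ... 1 when width is 8
        if value <= number:
            bits += "1"                  # this positional value fits, so use it
            number -= value
        else:
            bits += "0"
    return bits


print(binary_to_decimal("10100100"))   # 164
print(decimal_to_binary(93))           # 01011101
```

Scanning the positional values from left to right gives the same result as repeatedly looking for the largest value that still fits, because any value that does not fit is simply given a 0.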

Base 8 - octal numbering system

  • numbering system often used by programmers to more easily represent binary numbers
  • in base 8, there are 8 numeric symbols, 0 to 7
  • similar to the decimal system, in an octal number such as 264, the digits have different values depending on their position

    • this makes it easy to convert octal numbers to decimal
    • because this is base 8, each digit is multiplied by the appropriate power of 8:

        2 x 8^2 = 2 x 64 = 128
        6 x 8^1 = 6 x  8 =  48
        4 x 8^0 = 4 x  1 =   4
      
    • adding 128 + 48 + 4 gives us 180 in decimal

    • thus 180 (base 10) is equivalent to 264 (base 8)
  • the real power of octal is its use in representing binary numbers
  • this is because one octal digit can exactly represent 3 bits:

      0 0 0  in binary is  0  in octal
      0 0 1                1
      0 1 0                2
      0 1 1                3
      1 0 0                4
      1 0 1                5
      1 1 0                6
      1 1 1                7
    
  • to convert a binary number, such as 11101010, to octal:

    • split up the binary number into groups of 3 bits, starting from the right side

        1 1 / 1 0 1 / 0 1 0

    • if there are fewer than 3 bits in the left-most grouping, you may ignore this or add leading zeroes

        0 1 1 / 1 0 1 / 0 1 0

    • leading zeroes don't change the value of a number, in any numbering system
    • now simply convert each 3-bit grouping to the corresponding octal digit

        0 1 1 / 1 0 1 / 0 1 0
          3   /   5   /   2

    • thus 352 (base 8) is equivalent to 11101010 (base 2)
  • to convert an octal number, such as 760, to binary:
    • reverse the above process, by replacing each octal digit with its 3-bit equivalent

          7   /   6   /   0
        1 1 1 / 1 1 0 / 0 0 0

    • thus 111110000 (base 2) is equivalent to 760 (base 8)
  • example of the use of octal numbers:
    • to set permission on a file, a command similar to "chmod 751 filename" might be used
    • the digits for the chmod command are octal; we will look at the details next week
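
The 3-bit grouping technique can be sketched in Python as follows. The function names are hypothetical and the code is only an illustration of the steps above.

```python
def binary_to_octal(bits):
    """Group bits in threes from the right and map each group to one octal digit."""
    padding = (-len(bits)) % 3           # leading zeroes needed to fill the left-most group
    bits = "0" * padding + bits
    groups = [bits[i:i + 3] for i in range(0, len(bits), 3)]
    return "".join(str(int(group, 2)) for group in groups)


def octal_to_binary(digits):
    """Replace each octal digit with its 3-bit equivalent."""
    return "".join(format(int(digit, 8), "03b") for digit in digits)


print(binary_to_octal("11101010"))   # 352
print(octal_to_binary("760"))        # 111110000
```

Padding on the left is safe because leading zeroes do not change the value of the number.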

Base 16 - hexadecimal numbering system (hex)

  • numbering system often used by programmers to more easily represent binary numbers
  • in base 16, there are 16 numeric symbols, 0 1 2 3 4 5 6 7 8 9 A B C D E F
  • similar to the decimal system, in a hex number such as A1C, the digits have different values depending on their position

    • this makes it easy to convert hex numbers to decimal
    • because this is base 16, each digit is multiplied by the appropriate power of 16:

        A x 16^2 = 10 x 256 = 2560
        1 x 16^1 =  1 x  16 =   16
        C x 16^0 = 12 x   1 =   12
      
    • adding 2560 + 16 + 12 gives us 2588 in decimal

    • thus 2588 (base 10) is equivalent to A1C (base 16)
  • the real power of hexadecimal is its use in representing binary numbers
  • this is because one hexadecimal digit can exactly represent 4 bits:

      0 0 0 0  in binary is  0  in hexadecimal
      0 0 0 1                1
      0 0 1 0                2
      0 0 1 1                3
      0 1 0 0                4
      0 1 0 1                5
      0 1 1 0                6
      0 1 1 1                7
      1 0 0 0                8
      1 0 0 1                9
      1 0 1 0  in binary is  A  in hexadecimal and  10  in decimal
      1 0 1 1                B                      11
      1 1 0 0                C                      12
      1 1 0 1                D                      13
      1 1 1 0                E                      14
      1 1 1 1                F                      15
    
  • to convert a binary number, such as 111110000, to hexadecimal:

    • split up the binary number into groups of 4 bits, starting from the right side

        1 / 1 1 1 1 / 0 0 0 0

    • if there are fewer than 4 bits in the left-most grouping, you may ignore this or add leading zeroes

        0 0 0 1 / 1 1 1 1 / 0 0 0 0

    • leading zeroes don't change the value of a number, in any numbering system
    • now simply convert each 4-bit grouping to the corresponding hex digit

        0 0 0 1 / 1 1 1 1 / 0 0 0 0
           1    /    F    /    0

    • thus 1F0 (base 16) is equivalent to 111110000 (base 2)
  • to convert a hex number, such as 3C9, to binary:
    • reverse the above process, by replacing each hex digit with its 4-bit equivalent

           3    /    C    /    9
        0 0 1 1 / 1 1 0 0 / 1 0 0 1

    • thus 1111001001 (base 2), with the two leading zeroes dropped, is equivalent to 3C9 (base 16)
  • example of the use of hex numbers:
    • this html tag sets the colour for background on a web page
    • the digits are hex, in pairs, representing red, green, and blue intensities
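
The hex conversions above, and the splitting of a colour value into pairs, can be sketched in Python. All function names here are hypothetical, and FF8000 is a made-up sample colour, since the notes do not give one. HTML colour values list the pairs in red, green, blue order.

```python
HEX_DIGITS = "0123456789ABCDEF"   # the position of each symbol is its value


def hex_to_decimal(text):
    """Multiply each hex digit by the appropriate power of 16 and add the results."""
    total = 0
    for position, symbol in enumerate(reversed(text.upper())):
        total += HEX_DIGITS.index(symbol) * 16 ** position
    return total


def binary_to_hex(bits):
    """Group bits in fours from the right and map each group to one hex digit."""
    padding = (-len(bits)) % 4
    bits = "0" * padding + bits
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(text):
    """Replace each hex digit with its 4-bit equivalent."""
    return "".join(format(HEX_DIGITS.index(s), "04b") for s in text.upper())


def split_colour(colour):
    """Split a six-digit hex colour into red, green and blue intensities (0-255)."""
    return tuple(hex_to_decimal(colour[i:i + 2]) for i in (0, 2, 4))


print(hex_to_decimal("A1C"))        # 2588
print(binary_to_hex("111110000"))   # 1F0
print(hex_to_binary("3C9"))         # 001111001001 (leading zeroes kept)
print(split_colour("FF8000"))       # (255, 128, 0)
```

Each pair of hex digits covers exactly 8 bits, which is why every colour intensity ranges from 0 to 255.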

Representing Signed Numbers in Binary

  • computers use a technique called "two's complement" to handle signed numbers, that is, numbers that can be positive or negative
  • to represent a negative number, such as -93:
    • find the binary number for the positive number 93 (we did this above)

          1 0 1 1 1 0 1

    • add at least one leading zero

        0 1 0 1 1 1 0 1

    • invert all the bits, by turning 1's into 0's and 0's into 1's

        1 0 1 0 0 0 1 0

    • now add 1 to the result

          1 0 1 0 0 0 1 0
        +               1
          ---------------
          1 0 1 0 0 0 1 1

    • thus 10100011 (base 2) is the 8-bit signed binary number equivalent to -93 (base 10)
    • note that this is also the 8-bit unsigned binary number equivalent to 163 (base 10)
    • you MUST know the representation being used in order to be able to convert properly
  • to convert a negative binary number to its positive value, the technique is identical - invert the bits and add one
    • to find the number that 10100011 represents as a signed number:

        1 0 1 0 0 0 1 1

    • invert all the bits, by turning 1's into 0's and 0's into 1's

        0 1 0 1 1 1 0 0

    • now add 1 to the result

          0 1 0 1 1 1 0 0
        +               1
          ---------------
          0 1 0 1 1 1 0 1

    • thus, we get back to 1011101, which is 93, so 10100011 represents -93
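
The invert-and-add-one procedure can be sketched in Python in both directions. The function names and the default width of 8 bits are assumptions made for this illustration.

```python
def invert(bits):
    """Turn every 1 into 0 and every 0 into 1."""
    return "".join("1" if b == "0" else "0" for b in bits)


def to_twos_complement(number, width=8):
    """Return the width-bit two's complement pattern for a signed integer."""
    if not -(2 ** (width - 1)) <= number < 2 ** (width - 1):
        raise ValueError("number does not fit in the requested width")
    if number >= 0:
        return format(number, f"0{width}b")
    positive = format(-number, f"0{width}b")                   # 01011101 for 93
    return format(int(invert(positive), 2) + 1, f"0{width}b")  # invert, then add 1


def from_twos_complement(bits):
    """Interpret a bit pattern as a signed two's complement number."""
    if bits[0] == "0":                       # leading 0 means the value is not negative
        return int(bits, 2)
    return -(int(invert(bits), 2) + 1)       # invert, add 1, then attach the minus sign


print(to_twos_complement(-93))            # 10100011
print(from_twos_complement("10100011"))   # -93
print(int("10100011", 2))                 # 163 when read as unsigned
```

The last line shows why the representation must be known: the same 8 bits mean -93 as a signed number and 163 as an unsigned one.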

Permissions

  • permissions can only be changed by the file's owner or the superuser (system administrator)
  • chmod is used to alter access permissions to an existing file or directory
  • the 9 permission bits displayed in an ls -al listing are read/write/execute for user(owner)/group/other
  • chmod xxx filename
    • xxx is 3 octal digits representing the binary string rwxrwxrwx, where the first three characters are read/write/execute permission for the user, the next three for the file's group, and the last three for all others
    • e.g. chmod 640 file1 - 640 is 110 100 000 in binary (rw-r-----), so the user would have read and write permission, everyone in the group would have read permission, and all others would have no permission
    • this is called the "octal" or "absolute" method of changing permissions
  • chmod u+r filename
    • u represents user, could also be g for group, o for other, a for all
    • + represents addition of permission, could also be - for removal, = to set the permission exactly
    • r represents read permission, could also be w for write, x for execute
    • e.g. chmod u+x file1 - would give the user execute permission in addition to whatever they had before
    • e.g. chmod g-w file1 - would take away write permission from the file's group if they had it before
    • e.g. chmod o=r file1 - would set all others' permission to read only regardless of what they had before
    • this is called the "symbolic" or "relative" method of changing permissions
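
The effect of both chmod methods on the 9 permission bits can be modelled in Python. This is only a sketch: the function names are hypothetical, nothing here touches real files, and apply_symbolic handles just a single clause with one "who" letter, which is far less than the real chmod command accepts.

```python
def rwx_string(mode):
    """Render the 9 permission bits of an octal mode as rwxrwxrwx."""
    letters = "rwxrwxrwx"
    return "".join(
        letter if mode & (1 << (8 - i)) else "-"
        for i, letter in enumerate(letters)
    )


def apply_symbolic(mode, clause):
    """Apply one simple symbolic clause such as 'u+x', 'g-w' or 'o=r' to a mode."""
    who, operator, perms = clause[0], clause[1], clause[2:]
    shifts = {"u": [6], "g": [3], "o": [0], "a": [6, 3, 0]}[who]
    values = {"r": 4, "w": 2, "x": 1}
    digit = sum(values[p] for p in perms)          # e.g. "rw" becomes 6
    for shift in shifts:
        if operator == "+":
            mode |= digit << shift                 # add the permission bits
        elif operator == "-":
            mode &= ~(digit << shift)              # remove the permission bits
        elif operator == "=":
            mode = (mode & ~(7 << shift)) | (digit << shift)   # replace all three bits
        else:
            raise ValueError("operator must be +, - or =")
    return mode


print(rwx_string(0o640))                    # rw-r-----
print(oct(apply_symbolic(0o640, "u+x")))    # 0o740
print(oct(apply_symbolic(0o664, "g-w")))    # 0o644
print(oct(apply_symbolic(0o647, "o=r")))    # 0o644
```

The 0o prefix is how Python writes octal literals, so 0o640 is the same value as the 640 given to chmod.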

Directory Permissions

  • r permission for a directory allows viewing of file names in the directory, but no access to the files themselves (regardless of the files' permission settings)
  • x gives passthrough permission for a directory, which allows access to any files in the directory which have appropriate permissions set, but doesn't allow viewing of file names in the directory
  • r and x permissions allow viewing of file names, and access to any files which have appropriate permissions set
  • w and x permissions allow adding or removing of files, but don't allow viewing of file names
  • r and w and x permissions allow viewing of file names, access to any files which have appropriate permissions set, and adding and removing of files
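
The combinations listed above follow from three simple rules, which can be modelled in Python. This sketch only restates the rules for a single permission digit; it does not query a real file system, and the function name is hypothetical.

```python
def directory_abilities(digit):
    """List what one permission digit (0-7) allows on a directory."""
    r, w, x = bool(digit & 4), bool(digit & 2), bool(digit & 1)
    abilities = []
    if r:
        abilities.append("view file names")
    if x:
        abilities.append("access files (subject to each file's own permissions)")
    if w and x:
        abilities.append("add and remove files")   # w is of no use without x
    return abilities


# r, x, r and x, w and x, r and w and x
for digit in (4, 1, 5, 3, 7):
    print(digit, directory_abilities(digit))
```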

umask

  • umask defines default permissions for newly created files and directories; it doesn't change permissions on existing files
    • for directories, default permissions will be 777 minus the umask, digit by digit; for files, any remaining execute permissions are then removed as well
    • e.g. umask - by itself, shows the current umask setting
    • e.g. umask 077 - new directories will be 700, new files will be 600
    • e.g. umask 023 - new directories will be 754, new files will be 644
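
The umask arithmetic can be checked with a short Python sketch. It assumes the usual case, where programs ask for 777 when creating directories and 666 when creating files; the function name default_modes is hypothetical and no real umask is changed.

```python
def default_modes(umask):
    """Return (directory mode, file mode) that a given umask would normally produce."""
    directory_mode = 0o777 & ~umask   # clear every bit that is set in the umask
    file_mode = 0o666 & ~umask        # files start without execute bits
    return directory_mode, file_mode


for mask in (0o077, 0o023):
    directory_mode, file_mode = default_modes(mask)
    print(f"umask {mask:03o}: directories {directory_mode:03o}, files {file_mode:03o}")
# umask 077: directories 700, files 600
# umask 023: directories 754, files 644
```

Clearing bits gives the same answer as the digit-by-digit subtraction from 777, because no digit of a umask can be larger than 7.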