The goal of this lab is to introduce you to the Git command-line client and how to write reusable scripts.

Scripts

A script in the Linux environment is any program that is interpreted when it is run (i.e., the program is distributed as source code). In this sense, there are shell scripts (written in the shell language you saw last time), Python scripts, Ruby scripts, etc.

The advantage of so-called scripting languages is that they require only a text editor for development and that they are easily portable. The disadvantage is that you need to install the interpreter first. Fortunately, Linux typically comes with many interpreters preinstalled, so starting with a scripting language is very easy.

Simple shell scripts

To write a shell script, we simply write the commands into a file (instead of typing them in a terminal).

As a simple example, we want to show a hexadecimal dump of a file that we download from the Internet. We will download a sample GIF image.

The commands would be:

wget "http://www.matthewflickinger.com/lab/whatsinagif/images/sample_1.gif" -O /tmp/mf_sample.gif
hexdump -C /tmp/mf_sample.gif

Store the above in a file first.sh. Now cd into the directory with this file and run

sh first.sh

What happened?

Notice two things in the script: we used quotes around the URL and we stored the file in /tmp. Last time, we used quotes when there was a space in the filename. Generally, you would use quotes whenever there is a possibility that the argument could be “tricky”. We will talk about this later when we cover variables and their expansion; for now, remember that quotes around arguments are generally a safe way to prevent surprises.

Change the script to first execute cd /tmp and use relative file paths. Then run the script again. What happened? Did your shell end up in /tmp after the script terminated?

This is an essential takeaway: scripts (or any programs, for that matter) can change their working (current) directory, but the working directory is always local to the process (the running program). Thus, when the program terminates, the caller (i.e., the shell) is still in the same directory.
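The same behaviour can be observed from Python. The following is a minimal sketch, not part of the lab materials: it starts a child process that changes its own working directory and then checks that the parent's directory did not move. The function name child_and_parent_cwd is made up for this illustration.

```python
import os
import subprocess
import sys
import tempfile


def child_and_parent_cwd(target_dir):
    """Run a child that changes directory; return (child cwd, parent before, parent after)."""
    parent_before = os.getcwd()
    # The child changes its own working directory and prints where it ended up.
    child_code = "import os, sys; os.chdir(sys.argv[1]); print(os.getcwd())"
    result = subprocess.run(
        [sys.executable, "-c", child_code, target_dir],
        capture_output=True, text=True, check=True,
    )
    child_cwd = result.stdout.strip()
    parent_after = os.getcwd()
    return child_cwd, parent_before, parent_after


if __name__ == "__main__":
    child, before, after = child_and_parent_cwd(tempfile.gettempdir())
    print("child ended in:", child)
    print("parent before: ", before)
    print("parent after:  ", after)  # same as before: the change stayed in the child
```

The child plays the role of first.sh and the Python program plays the role of your shell.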

If you want to see what is happening, run the script as sh -x first.sh. Try it now. For longer scripts, it is better to print your own messages, as -x tends to become too verbose; it is really a debugging aid.

To print a message to the terminal, you can use the echo command. With a few exceptions (more about these later), all arguments are simply echoed to the terminal.

Create a script echos.sh with the following content and explain the differences.

echo alpha bravo charlie
echo alpha  bravo   charlie
echo "alpha   bravo"   charlie
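One way to check your explanation is to look at how the command line is split into arguments before echo ever sees it. The sketch below uses Python's shlex module, which follows POSIX-like quoting rules; it is only an approximation of what the shell does (no variables, no globbing).

```python
import shlex

command_lines = [
    'echo alpha bravo charlie',
    'echo alpha  bravo   charlie',
    'echo "alpha   bravo"   charlie',
]


def split_like_shell(command_line):
    """Split a command line into words, honouring quotes (approximation of the shell)."""
    return shlex.split(command_line)


for command_line in command_lines:
    # The first word is the command; the remaining words are its arguments.
    print(split_like_shell(command_line))
# → ['echo', 'alpha', 'bravo', 'charlie']
# → ['echo', 'alpha', 'bravo', 'charlie']
# → ['echo', 'alpha   bravo', 'charlie']
```

Unquoted runs of spaces only separate arguments, whereas spaces inside quotes are part of a single argument.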

If you have some Python script on the disk, you can execute it as

python script.py

Executable bit

Running scripts by specifying the interpreter to use (i.e., the command to run the script file with) is not very elegant. Linux offers another way: we mark the file as executable and Linux handles the rest.

Actually, when we execute the cat command or mc, there is a file (usually in the /usr/bin directory) that is named cat or mc and that is marked executable. Notice that there is no extension.

To get an idea of the number of programs installed, look into /usr/bin.

To mark our first script as executable, simply run chmod +x first.sh. We will talk about other features of chmod (and access rights in general) later on; for now, remember only chmod +x.

Run ls in the folder again. You should see first.sh now printed in green. If not, you can try ls --color or check that you have run chmod correctly.
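To see what the executable bit actually is, the following sketch sets it from Python. It is a simplified stand-in for chmod +x (it ignores the umask, which the real chmod takes into account), and the helper names are made up for this illustration.

```python
import os
import stat
import tempfile


def add_executable_bit(path):
    """Simplified `chmod +x`: add the execute bits to the current mode of the file."""
    current_mode = os.stat(path).st_mode
    os.chmod(path, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def is_executable_by_owner(path):
    """Return True if the owner's execute bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "first.sh")
        with open(script, "w") as handle:
            handle.write("echo Hello\n")
        print("before:", is_executable_by_owner(script))
        add_executable_bit(script)
        print("after: ", is_executable_by_owner(script))
```

The file content does not change at all; only the metadata (the mode) of the file is updated.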

When you type a command (e.g., cat), the shell looks into the directories listed in the so-called $PATH to find the file with the program. Unlike in some other operating systems, the shell does not look into the working directory when the program cannot be found in the $PATH.

To run a program in the current directory, we need to specify its path. Luckily, it does not have to be an absolute path; a relative one is sufficient. Thus, we need to execute
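The lookup can be sketched in a few lines of Python. This is an illustration of the idea rather than the shell's real implementation: the function name find_in_path is made up, and empty entries in $PATH are simply skipped here. The standard library offers shutil.which for the same purpose.

```python
import os


def find_in_path(command, path_value=None):
    """Return the first executable file named `command` found in the PATH directories."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # simplification: ignore empty entries
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Note that the working directory was never consulted.
    return None


if __name__ == "__main__":
    print(find_in_path("cat"))
    print(find_in_path("first.sh"))
```

Unless the directory with first.sh happens to be listed in your $PATH, the second lookup finds nothing, which is why we need to type ./first.sh.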

./first.sh

Try it yourself.

If you are in a different directory, running ../first.sh (or similar) would work too.

Shebang (hashbang)

Create the simplest Python program now. It should only contain print("Hello") without main or without any other content.

Store it in the file hello.py, make the script executable, and run it.

The result is not very satisfying, but the reason is very simple. Linux executed this script as a shell one!

To fix that, we need to specify which interpreter to use. This is done via a so-called shebang or hashbang. As a matter of fact, you have already encountered it several times.

If the first line of the script starts with #! (hence the name hash and bang), Linux expects a path to the interpreter after it and will use this interpreter instead of the default sh.

It is a good practice to specify the interpreter always and never rely on the default fallback to the shell script.

For shell scripts, we will be using #!/bin/bash; for Python, we need to use #!/usr/bin/env python3.
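If you are unsure which interpreter a script will run under, you can simply look at its first line. The following sketch does that from Python; read_shebang is a made-up helper name and the two sample files are created only for the demonstration.

```python
import os
import tempfile


def read_shebang(path):
    """Return the text after '#!' on the first line of a file, or None if there is no shebang."""
    with open(path, "rb") as script:
        first_line = script.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        with_shebang = os.path.join(workdir, "hello.py")
        with open(with_shebang, "w") as handle:
            handle.write('#!/usr/bin/env python3\nprint("Hello")\n')
        without_shebang = os.path.join(workdir, "plain.py")
        with open(without_shebang, "w") as handle:
            handle.write('print("Hello")\n')
        print(read_shebang(with_shebang))     # /usr/bin/env python3
        print(read_shebang(without_shebang))  # None
```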

In small print, note that most interpreters use # to denote a comment which means that no extra handling is needed to skip the first line (as it is really not needed by the interpreter).

You probably noticed that when we ran the script explicitly, we used sh and not bash. At this moment, sh or bash would make no difference (sh refers to the original shell born in the seventies, bash is its improved version). Later on, we will be using some advanced features that are not present in plain sh, so it is easier to remember now to use /bin/bash there.

For Python and other more complex languages, you will often see the variant with env and python3 (or ruby). We will talk about this later on; for now, please just remember to use this version.

Fix hello.py from the beginning of this section and run it again.

Command-line arguments

Command-line arguments (such as -l for ls or -C for hexdump) are the usual way to control the behaviour of CLI tools in Linux. For us, as developers, it is important to learn how to work with them inside our programs.

We will talk about using these arguments in shell scripts later on; today we will handle them in Python.

Accessing these arguments in Python is very easy. We need to add import sys to our program and then we can access these arguments in the sys.argv list.

Write a program args.py that prints its arguments.

Let us execute it.

./args.py
./args.py one two
./args.py "one two"

Note that index zero is occupied by the command itself (we will not use it now, but it can be used for some clever tricks), and notice how the second and third commands differ when seen from inside Python.
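One possible shape of such a program is sketched below; your own args.py may well look different. It prints every item of sys.argv together with its index, using repr so that spaces inside an argument stay visible. The helper name describe_arguments is made up for this sketch.

```python
#!/usr/bin/env python3

import sys


def describe_arguments(argv):
    """Return one 'index: value' line per item of the argument list."""
    return ["{}: {!r}".format(index, value) for index, value in enumerate(argv)]


def main():
    for line in describe_arguments(sys.argv):
        print(line)


if __name__ == '__main__':
    main()
```

With this version, ./args.py one two prints three lines (indices 0 to 2), while ./args.py "one two" prints only two, because the quoted text arrives as a single argument.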

Other interpreters

Look at the following script and explain what it does (we will need it later).

#!/usr/bin/env python3

import sys

def run_with_file(input_file):
    total = 0
    for line in input_file:
        line = line.strip()
        if (not line) or line.startswith('#'):
            continue
        parts = line.split()
        if parts[0] == 'echo':
            print(total)
        elif parts[0] == 'add':
            total += int(parts[1])
        else:
            print("Unknown command ('{}')!".format(parts[0]))

def main():
    if len(sys.argv) != 2:
        print("Run with exactly one argument - filename with commands.")
        return
    with open(sys.argv[1]) as inp:
        run_with_file(inp)

if __name__ == '__main__':
    main()

We will now explore which interpreters we can put into the shebang.

Construct an absolute (!) path (hint: man 1 realpath) to the args.py we used above. Use it as a shebang on an otherwise empty file (e.g. use-args) and make this file executable.

And now run it like this:

./use-args
./use-args first second

You will see that argument zero now contains the path to args.py, i.e., to the interpreter named in the shebang. The argument at index one contains the outer script, use-args, and only after these items come the actual command-line arguments (first and second). This is essential: when you add a shebang, the interpreter receives the input filename as the first argument. In other words, every Linux-friendly interpreter shall start evaluating a program passed to it as a filename in the first argument.

While it may seem like an exercise in futility, it demonstrates an important principle: GNU/Linux is extremely friendly towards the creation of mini-languages. If you need to create an interpreter for your own mini-language (such as the summation one at the beginning of this section), you only need to make sure it accepts the input filename as the first argument. And voilà, users can create their own executables on top of it.
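The rule can be written down as a tiny model. The sketch below is not how the kernel is implemented; it only reproduces the resulting argument list under the assumption that the shebang holds an interpreter path and at most one optional argument (Linux passes the rest of the line as a single argument). The function name and the path /home/student/args.py are hypothetical.

```python
def argv_for_interpreter(shebang_line, script, user_args):
    """Model the argument list an interpreter receives for a script with this shebang."""
    command = shebang_line[2:].strip()  # drop the leading '#!'
    interpreter, _, optional_arg = command.partition(" ")
    argv = [interpreter]
    if optional_arg.strip():
        argv.append(optional_arg.strip())  # the rest of the line is one argument
    argv.append(script)      # the script itself comes right after the interpreter
    argv.extend(user_args)   # what the user typed comes last
    return argv


if __name__ == "__main__":
    # Hypothetical absolute path to args.py used as the shebang of use-args.
    print(argv_for_interpreter("#!/home/student/args.py", "./use-args", ["first", "second"]))
    # → ['/home/student/args.py', './use-args', 'first', 'second']
```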

As another example, prepare the following file and store it as experiment (i.e., no file extension) and make the file executable.

#!/bin/bash

echo Hello

Note that we decided to drop the extension again altogether. The user does not really need to know which language was used. That is captured by the shebang, after all.

Now change the shebang to #!/usr/bin/cat. Run the program again. What happens? Now run it with an argument (e.g., ./experiment experiment). What happened?

Change the shebang to #!/usr/bin/echo. What happened?

Git on command-line

This section will describe how to use Git on the command-line as opposed to using the GUI superstructure offered by GitLab. We already described the motivation for both Git and GitLab in Lab #1. Here we will show how to access the files from the command-line to improve your experience when using Git.

While it is possible to edit many files on-line in GitLab, it is much easier to have them locally and use a better editor (or IDE). Furthermore, not all tools have their on-line counterparts and you have to run them locally.

Therefore, Git offers a command-line client that can download the whole project to your machine, track changes in it, and then upload it back to the server (GitLab in our case but there are other products too).

As you will see, the whole project as you see it on GitLab becomes a directory on your hard-drive and the whole process of submitting changes is much easier. As usual, there are also GUI alternatives to the commands we will be showing here, but we will devote our attention to the CLI variants only.

Setting your editor

Git will often need to run your editor. It is essential to ensure it uses the editor of your choice.

We will explain the following steps in more detail later on; for now, make sure that you add the following line to the end of your ~/.bashrc file (replace mcedit with the editor of your choice).

export EDITOR=mcedit

Now open a new terminal and run (including the dollar sign)

$EDITOR ~/.bashrc

If you set the above correctly, you should see .bashrc opened again in your favorite text editor.

You need to close all open terminals and start new ones for this change to take effect (i.e., before you start using any of the Git commands mentioned below).

Manpages for Git

The Git CLI client is generally used as

git subcommand --options-for-subcommand

That is, you always run git and its first argument is the name of the Git command to execute.

Manual pages for Git are split into separate pages named git-subcommand (that is, the command config is documented in man 1 git-config).

You can also run git subcommand --help or even git help subcommand.

Configure Git

One of the key concepts in Git is that each commit (change) is authored, i.e., it is known who made it. We will skip commit signing here and will not consider identity forgery or theft.

Thus, we need to tell Git who we are. The following two commands are the absolute minimum you need to execute on any machine (or account) where you want to use Git.

git config --global user.name "My real name"
git config --global user.email "my-email"

The --global flag specifies that this setting is valid for all Git projects. You can change this locally by running the same command without this flag inside a specific project. That can be useful to distinguish your freelance and corporate identity, for example.

Note that Git does not check the validity of your e-mail address or your name (indeed, there is no way to do it). Therefore, anything can be there. However, if you use your real e-mail address, GitLab will be able to pair the commit with your account etc., which can be quite useful.

The decision is up to you.

Cloning for the first time (git clone)

For the following example, we will be using the repository teaching/nswi177/2021-summer/common/csv-templater.

Fork this repository to your own namespace (in GitLab via web browser) first.

Forking a project means creating a copy for yourself on GitLab. Create the fork: you do not have write access to our repository and we do not want you to fight over the same files anyway.

Move to your (forked) project and click on the blue Clone button. You should see Clone with SSH and Clone with HTTPS addresses.

Copy the HTTPS address and use it as the address for the clone command.

git clone https://gitlab.mff.cuni.cz/YOUR_LOGIN/csv-templater.git

The command will ask you for your username and password. As usual with our GitLab, please use the SIS credentials.

Note that some environments may offer you a keyring or another form of credential helper. Feel free to use them. Later on, we will see how to use SSH and asymmetric cryptography for seamless work with Git projects without any need for username/password handling.

You should now have the csv-templater directory on your machine. Move to it and see what files are there. What about hidden files?

Unless stated otherwise, all commands will be executed from the csv-templater directory.

Making a change (git status and git diff)

Fix typos on line 11 in the Python script and in the README.md and run git status before and after the change. Carefully read the whole output of this command to understand what it reports.

UPDATE: While cleaning up the script, we also removed the typos ;-). Thus, to emulate the fix, change (with Python formatting) to (with Python-style formatting). Sorry.

Create a new file, demo/people.csv, with at least three columns and four rows. Again, check how git status reports this change in your project directory.

What have you learned?

Run git diff to see how Git tracks the changes you made. Why is this output suitable for source code changes?

Note that git diff is also extremely useful to check that the change you made is correct as it focuses on the context of the change rather than the whole file.
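The format that git diff prints is known as a unified diff. To get a feeling for it without touching your repository, the following sketch produces the same kind of output with Python's difflib module. The three lines of text are made up for this illustration; only the changed phrase comes from the task above.

```python
import difflib

# Hypothetical file content before and after the emulated fix.
before = [
    "CSV templater\n",
    "Fills a template (with Python formatting)\n",
    "for every row of a CSV file.\n",
]
after = [
    "CSV templater\n",
    "Fills a template (with Python-style formatting)\n",
    "for every row of a CSV file.\n",
]

diff_lines = list(difflib.unified_diff(
    before, after, fromfile="a/README.md", tofile="b/README.md",
))

# Removed lines start with '-', added lines with '+',
# unchanged context lines start with a space.
print("".join(diff_lines), end="")
```

Only the changed line and a little of its surroundings are shown, which is exactly what makes this format convenient for reviewing source code changes.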

Making the change permanent (git add and git commit)

Now prepare for your first commit (recall that a commit is basically a version or a named state of the project): run git add csv_templater.py. We will take care of the typo in README.md later.

How does the output of git status differ from the previous state?

Make your first commit via git commit. Do not forget to use a descriptive commit message!

Note that without any other options, git commit will open your text editor. Write the commit message there and quit the editor (save the file first). Your commit is done.

For short commit messages, you may use git commit -m "Typo fix" where the whole commit message is given as the argument to the -m option (notice the quotes because of the space).

What will git status look like now? Think about it first before actually running the command.
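If you would like to replay the whole status, add, commit cycle without touching your real project, the sketch below drives git from Python in a throwaway directory. It is only an illustration: the file name notes.txt and the identity are placeholders, and it uses git status --short (a more compact output than the plain git status used in the text).

```python
import os
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run `git <args>` inside the directory `repo` and return its standard output."""
    result = subprocess.run(
        ["git", "-C", repo] + list(args),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def status_add_commit_demo():
    """Create a throwaway repository and return the status before add, after add, after commit."""
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init", "--quiet")
        # Placeholder identity, local to this repository only (no --global).
        git(repo, "config", "user.name", "Example Student")
        git(repo, "config", "user.email", "student@example.com")
        git(repo, "config", "commit.gpgsign", "false")

        with open(os.path.join(repo, "notes.txt"), "w") as handle:
            handle.write("first version\n")

        untracked = git(repo, "status", "--short")  # file is not known to Git yet
        git(repo, "add", "notes.txt")
        staged = git(repo, "status", "--short")     # file is prepared for the commit
        git(repo, "commit", "--quiet", "-m", "Add notes")
        clean = git(repo, "status", "--short")      # nothing left to commit
        return untracked, staged, clean


if __name__ == "__main__":
    if shutil.which("git") is None:
        print("git is not installed")
    else:
        for label, output in zip(("untracked", "staged", "clean"), status_add_commit_demo()):
            print("{}: {!r}".format(label, output))
```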

Sending the changes to the server

We will now propagate your changes back to GitLab by using git push. It will again ask for your password and after that, you should see your changes on GitLab.

Which changes are on GitLab?

Exercise

Commit the fix of the second typo as a second commit from the command-line.

By the way, have you tried running the CSV templater? Add the following example to the README as the third commit.

./csv_templater.py -t demo/breed.txt demo/patrol.csv

As another commit, add the CSV file with extra data that you created earlier.

Now push the changes to GitLab. Note that all commits were pushed at the same time.

Browsing through the commits (git log)

Investigate what is in the Repository -> Commits menu in GitLab. Compare it with the output of git log and git log --oneline.

Getting the changes from the server

Add another example to the README but this time make the change on GitLab.

./csv_templater.py -t demo/call.txt -o "call-{name}.txt" demo/patrol.csv

To update your local clone of the project, execute git pull.

Note that git pull is quite powerful as it can incorporate changes that happened virtually at the same time in both the GitLab web UI and your local clone. However, understanding this process also requires knowledge of branches, which is out of scope for this lab.

Thus, for now, remember not to mix changes made locally and in the GitLab UI (or on a different machine) without always ending with git push and starting with git pull.

Other bits

Again, several assorted notes that do not fit into the sections above but are worth knowing.

If you do not have a colorful terminal (unusual but still possible), you can use ls -F to distinguish file types: directories will have a slash appended, executable files will have a star next to their filename.

Running ./tools/run_tests.sh from inside your clone of your grading repository (i.e., student-YOUR_LOGIN) will run the tests that are normally executed in GitLab.

Refer to the graded tasks section for another take on the tests that do not require downloading all tests again and again.

Graded tasks

From now on, prefer to use the command-line client for submitting the tasks.

Using Git CLI (15 points)

Use git config to temporarily change your e-mail to YOUR_SIS_LOGIN@gitlab.mff.cuni.cz (of course, replace YOUR_SIS_LOGIN with the right one) and make one commit to your graded task repository with this e-mail. You can create a new file 03/git_cli.txt if you do not know what to change ;-). Add the word graded-task to the commit message, please.

This task is not automatically checked by the nswi177-tests pipeline (CI) on GitLab.

03/scoring.py (40 points)

Write a Python program that recognizes the following mini-language for computing tournament scoring.

add team-zulu task1 5
add team-alpha task1 10
add team-alpha task2 5
add team-bravo task1 10
add team-yankee task1 5
summary After first week
csv week1.csv
add team-zulu task2 15
summary End of tournament
podium

We expect that your program could be used in a shebang (recall how sys.argv shall be used) for such data/program. That is, adding #!/absolute/path/to/03/scoring.py as the first line of the file and making the file executable with chmod +x would allow running the above as a script.

We expect that the above results in (note the ordering)

After first week
  team-alpha: 15
  team-bravo: 10
  team-yankee: 5
  team-zulu: 5
End of tournament
  team-alpha: 15
  team-bravo: 10
  team-yankee: 5
  team-zulu: 20
Medal podium
  team-zulu
  team-alpha
  team-bravo

and the file week1.csv would contain (again, ordered by team name)

team,score
team-alpha,15
team-bravo,10
team-yankee,5
team-zulu,5

Your program does not need to handle wrong input or CSV file creation issues (such as path to non-existent directory etc.).

You can choose to implement only part of the assignment. We consider the commands podium and csv as extras; most points will be awarded for add and summary (they really do not make sense without each other). The podium command should print the teams sorted by their points (if two teams have the same number of points, the order does not matter).

Feel free to reuse our code as a starting point (but not the code of your mates in the course).

Update: any text after summary is supposed to be copied as-is, i.e. it represents a user-defined title. For podium, the text is always Medal podium, text after csv represents file path.

03/git-identity.sh (15 points)

Write a shell script (including the executable bit and the right shebang) that prints your Git identity. That is, your Git username and your Git e-mail.

Do not use --global flag inside the script to allow for testing.

Update: use git commands, do not try reading from ~/.gitconfig or similar.

Update 2: print username and e-mail each on its own line.

03/tests.txt (30 points)

Explain what the following program does in your own words.

If you do not know some of the commands, look up their meaning in their manpages.

Among other things, answer from which directory you would run this script and what (in broad terms only) the command bats __tests/[01][0-9].bats does.

Note that the point of using man is not to learn everything about the command. Instead, use it to get the idea of what the program does: does it download a file from the Internet, does it convert different types of images, etc.

#!/bin/bash

rm -rf __tests
mkdir __tests
cd __tests
wget https://d3s.mff.cuni.cz/f/teaching/nswi177/tests.tar.gz
tar xzf tests.tar.gz
cd ..
bats __tests/[01][0-9].bats

Deadline: April 5, AoE

Solutions submitted after the deadline will not be accepted.

Note that at the time of the deadline we will download the contents of your project and start the evaluation. Anything uploaded/modified later on will not be taken into account!

Note that we will be looking only at your master branch (unless explicitly specified otherwise), so do not forget to merge from other branches if you are using them.