Warmup
main()
By convention (and long since standardized), the function that is first invoked when a C++ program is run is the function named main(). In lecture and in previous assignments, we have seen a very basic use of main():
#include <print>

int main() {
    std::println("Hello World!");
    return 0;
}
Returning from main()
We have noted a few important things about main().
First, main() is a function just like any other; the only thing that makes it special is its name – the function with that name is the one that will be called when your program is run. As “just a function”, main has a return type (in this case, int) and so must also return a value of that type. As with any other function, a return value is returned to the calling function. In the case of main(), there isn’t another function in your program that called main(), but the value is still returned. In particular, it is returned to the program that invoked your program – typically the shell (bash).
But where does that value go and how do you access it? If you run any executable (here ./test), you can read the value it returned from the shell immediately afterwards:
./test
echo "./test returned $?"
The second line must directly follow the execution of the program, because $? always holds the exit status of the most recently executed command. The return value from the program – the return value from that program’s main() function – is accessed with the special shell parameter $?. (Other shells use different names – if you use csh or tcsh, the variable is called $status.)
In your starter repository there is the file hello.cpp. As supplied, it returns 0 from main(). Build the executable (see Build and Run Code using VSCode), and run it as:
$ ./build/hello
$ echo $?
What is printed in response to the echo statement?
Next, change hello.cpp so that main() returns a non-zero value (say, 1 or -1). Recompile hello and run the program again:
$ ./build/hello
$ echo $?
What is returned in this case?
In the command-line world of Linux/Unix/etc., programs are often invoked by other programs (e.g., bash scripts), and the return values from executing programs are used to signal a successful execution (return 0 by convention means ‘success’) or an unsuccessful execution (by convention, return a non-zero error code). Actual program output is generally written as text to the standard output stream (in C++ this is std::cout, or std::print[ln] without specifying a target output), and input is read from the standard input stream (std::cin). Error messages are sent to a separate stream, standard error (std::cerr, or stderr if you use std::print or std::println). Both std::cout and std::cerr print to your terminal screen by default, but they can be redirected separately when you connect the inputs and outputs of programs together.
Calling main() with arguments
Command-line programs can also take parameters on the command line. For example, if you want to list the files in the directory “/usr”, you would invoke:
$ ls /usr
Here, ls is simply a program (by convention, Unix/Linux programs do not have a .exe extension). If you look in the /usr/bin directory, in fact, you will see the executable file for that command:
$ ls /usr/bin/ls
Shell programs like bash operate in a ‘read-evaluate-print’ loop (REPL). The shell takes all of the characters typed on the line before you hit enter, parses them, executes the command given (the first word on the line), and prints the result. To find that command, the shell searches the directories listed in the $PATH environment variable, in order, and runs the program if it is found.
You can see what your path is by printing the $PATH variable:
$ echo $PATH
Now, all of the programs that you might run (do an ls on, say, /usr/bin or any other directory in your path) have a main() function (assuming they are C or C++ programs, which most programs in Linux/Unix are). So they will return 0 or non-zero values (you can verify this by running some programs and checking $?):
$ ls /usr/bin
$ echo $?
$ ls /nosuchdirectory
$ echo $?
$ nosuchcommand
$ echo $?
But, these programs are doing more than what we have been doing so far: they take arguments. That is,
$ ls /usr/bin
lists the contents of the directory /usr/bin. So, somehow, the argument /usr/bin is being passed into the program – more precisely, somehow, /usr/bin is being passed to the main() function in the ls program.
The main() function we have seen to this point doesn’t have any arguments, so it isn’t obvious how argument passing would be done. However, the C++ standard allows main() to be defined in either of two forms, one without parameters and one with, and whichever form you define will be called when your program is run. (This is not overloading in the usual sense: you can only have a single main() – you must use one form or the other, not both.)
The form of main() that takes arguments is declared as follows:
int main(int argc, char *argv[]);
The first argument, traditionally named argc, contains the number of arguments that are passed to your main() function. The second argument, traditionally named argv, is an array of strings, each string of which is one of the arguments to your program.
Here is where we are forced to make a concession to the use of pointers. In C/C++, a pointer is the address in memory of a datum and is denoted with the * prefix. Technically, argv is an array of pointers, which are essentially memory addresses, and the data being pointed to is specified to be interpreted as character types when accessed by that pointer. In C++, a char pointer to a null-terminated string converts implicitly to std::string, so the two can be used more or less interchangeably, and we will be using the contents of argv[] in that way.
If we have our arguments in an array of strings and we know how many entries are in the array, we can loop through the array and read and interpret each entry. In your starter repository is a file repeat.cpp:
#include <print>

int main(int argc, char *argv[]) {
    // argc is an int, so an int index avoids a signed/unsigned comparison
    for (int i = 0; i < argc; ++i) {
        std::println("argv[{}]: {}", i, argv[i]);
    }
    return 0;
}
Build repeat and try the following. First, execute the program with no arguments at all:
$ ./build/repeat
What is printed?
Now execute the program with whatever arguments you like, e.g.,
$ ./build/repeat whatever arguments you like
What is printed?
Finally, note that "." is your current directory. We specify "." in front of the programs we compile locally because the shell will not find them by searching in its path. That is:
$ repeat
will not find repeat.
But the directory where repeat exists is part of the file system directory structure on your computer, and that location can be specified other than with ".". If you run the command
$ pwd
bash will print your current working directory. The current working directory is also stored in the environment variable PWD (csh and tcsh use cwd instead). So for instance, in my own case, echoing it might print this:
$ echo $PWD
/home/csc4700/work
(your working directory may be different).
Instead of using "." (which is a relative path), use the value of $PWD (which is an absolute path):
$ /home/csc4700/work/build/repeat
$ /home/csc4700/work/build/repeat some arguments or other
(You will need to specify the correct path for your development environment.)
What is printed?
Warm Up
main()
As we develop more sophisticated programs through the rest of this semester, we will need to pass arguments to main() – and to interpret those arguments in different ways. For example, if we want to pass in the size of a problem to run, we would put a number on the command line:
$ ./program 1024
Here, "1024" would be passed to the main() function in this program. But note that it is the string "1024", not the number 1024. For example, what happens if you try to compile the following program?
#include <print>

int main(int argc, char* argv[]) {
    int size = argv[1];
    std::println("size is {}", size);
    return 0;
}
To set the value of size to be a number, we have to convert (“parse”) the string in argv[1] into an integer and assign the resulting value to size.
Modify main.cpp so that it correctly assigns an integer value to size. You may find the functions std::atoi or std::atol or std::stol or similar to be useful. If you do end up using a library function to do the conversion from a character string to a number, you will also need to include the appropriate header file (the header file containing the declaration for that function) as well as invoke it with the proper namespace (e.g., std). Please consult the documentation for these functions to understand what header file needs to be #included.
In the rest of this assignment, we will be converting arguments from the command line from strings into the appropriate type (integer, etc.). Accordingly, the code samples for the rest of this assignment don’t give away how to do that but rather have a FIXME comment.
There is another problem with main.cpp
. Once you have made your modification, what happens if you don’t pass any argument at all to your program?
$ ./main
We need to do a little bit of error checking. One can do arbitrarily sophisticated argument checking and error handling, but we can do a few simple things that will serve us well for this course. The simplest, which is what we will do for this problem set, is to simply make sure we have the right number of arguments:
#include <cstdio>
#include <print>

int main(int argc, char* argv[]) {
    if (argc != 2) {
        std::println(stderr, "Usage: {} size", argv[0]);
        return -1;
    }
    // FIXME: change this code to properly convert argv[1] to the int 'size'
    int size = 0; // convert from argv[1]
    std::println("size is {}", size);
    return 0;
}
Until we introduce some more sophisticated approaches, you should use an approach like this for programs you write that take command-line arguments. Grading scripts for such programs will check that you handle the wrong number of command-line arguments correctly. A typical approach, as shown above, is to issue a “usage” message and exit with a non-zero exit code.
For now we will assume that the arguments themselves are well-formed (that is, you won’t be required to check that an argument that is supposed to be a number is actually a well-formed number).
Command line argument parsing is a common and important enough task in programming that numerous libraries exist to help with that process. A very basic one is the getopt function, which is specified by POSIX (declared in <unistd.h>) rather than by the C standard library itself. One that I have found to be useful but without too much overhead in either learning or in use is called “docopt”.
Questions
Answer the following questions in results/answers.md
- What does argv[0] always contain?
- Which entry of argv holds the first argument passed to the program?
- Which entry of argv holds the second argument passed to the program?
- How would you print just the last argument passed to a program?