http://web.eecs.utk.edu/~jplank/plank/classes/cs140/Notes/Running_Times/voidstar.html CS140 Lecture notes -- Implementing a class with a void * * James S. Plank * Directory: /home/plank/cs140/Notes/Running_Time * Lecture notes: http://web.eecs.utk.edu/~jplank/plank/classes/ cs140/Notes/Running_Time/voidstar.html * Original notes: Fall, 2019 * Last modified: Wed Nov 20 15:31:13 EST 2019 --------------------------------------------------------------------- Before you read This is not an "industry-standard" way to program in C++. In fact, it circumvents many of the nice features of C++, the most important of which is using the type-checking abilities of your compilers. When you learn inheritance and interfaces next semester, you may well feel that what I'm teaching here is wrong. However, I don't believe it's wrong, and this is indeed how my research group's classes work. Hiding information behind a (void *) is, in my opinion, a valuable thing to do, so I teach it to you. --------------------------------------------------------------------- A motivating example: Implementing a histogram data structure We're going to define and implement a C++ class to help us generate histograms from data. This class has three important methods: 1. Set_Bin_Size() -- our histogram will organize data by "bins", which represent regions of data. For example, if our bin size is 10, then bin 0 will represent all data values between 0 and 10 (not including 10), bin 1 will represent all data values between 10 and 20 (not including 20), etc. 2. Add_Value() -- this adds a value to our histogram. Basically, our data structure will figure out which bin it belongs to, and then it will add one to the bin. 3. Get_Data() -- this gives us our histogram data: what the bins are and how many data items are in each bin. You could use this information to make graphs (like the ones that I make to show how students have done on an exam.) Here's the entire class definition, in include/histogram.hpp: +---------------------------------------------------------------------------------------------------+ | #pragma once | | | | #include | | | | class Histogram { | | public: | | | | /* Constructors, destructor, assignment overload. */ | | | | Histogram(); | | ~Histogram(); | | Histogram(const Histogram &h); | | Histogram& operator= (const Histogram &h); | | | | void Clear(); // This clears the histogram's data | | // but retains the bin size if one has been set. | | | | bool Set_Bin_Size(double bs); // Returns false if bs is <= 0, or if histogram is non empty. | | double Get_Bin_Size() const; // Return -1 if the bin size has not been sent. | | bool Add_Value(double d); // Add a new value to the histogram. Returns false if | | // the bin size has not been set. | | | | /* Get_Data() creates these two vectors. There will be an element in each | | vector for every non-empty bin in the histogram. bin_ids[i] will contain | | the "id" of the bin, where an id of 0 corresponds to values between [0 and | | Bin_Size), an id of 1 corresponds to values between [Bin_Size and Bin_Size*2), | | etc. bin_ids will be sorted. num_elts[i] contains the number of data points | | in bin i. | | */ | | | | bool Get_Data(std::vector &bin_ids, std::vector &num_elts) const; | | | | protected: | | void *state; // This is so that whoever uses the data structure | | // does not know how it is implemented. | | }; | +---------------------------------------------------------------------------------------------------+ Everything is straightforward, with the exception of the one protected variable, state. You'll note it's a (void *). A (void *) is a pointer, where you don't know what it points to. This is a good thing, because the person who is reading the class definition does not know what it contains. Therefore, if you use this class, and are not privy to the source code that implements it, you don't know how it's implemented. Why is this nice? Because those who use the class cannot mess it up. And it's not really important how it's implemented, so long as the implementation is correct and efficient. --------------------------------------------------------------------- Three programs that use the class I have three programs that use the Histogram class: 1. src/histogram_tester.cpp -- this is one of my standard line-based testers that lets you make sure that the data structure is working as it should. 2. src/data_to_histogram.cpp -- this takes a bin size on the command line, and then reads words from standard input. Every word that is a number is inserted into the histogram, and then the histogram is printed at the end. 3. src/range_tester.cpp -- this generates random numbers and inserts them into the histogram. It times this activity. It then calls Get_Data(), and times that. It prints the timings, and if you want, it will also print the histogram. For this lecture, we will only concentrate on histogram_tester. I have compiled it with an implementation of the class which is in src/ histogram_vector.cpp into the binary bin/ht_vector. Let's walk through how it works: UNIX> make clean rm -f obj/* bin/* UNIX> make bin/ht_vector g++ -std=c++98 -Wall -Wextra -Iinclude -c -o obj/histogram_tester.o src/histogram_tester.cpp g++ -std=c++98 -Wall -Wextra -Iinclude -c -o obj/histogram_vector.o src/histogram_vector.cpp g++ -std=c++98 -Wall -Wextra -Iinclude -o bin/ht_vector obj/histogram_tester.o obj/histogram_vector.o UNIX> bin/ht_vector # Take a look at the commands usage: histo_tester prompt(- for empty) -- commands on stdin. commands: SET_BIN bin_size - Call Set_Bin(bin_size). ADD_VALUES v1 v2 ... - Call Add_Value(v) for each vi. PRINT - Get Get_Data() and print. CLEAR - Calls the Clear() method. DESTROY - Call the destructor and remake an empty histogram. PRINT_COPY - Print a copy, thus testing the copy constructor. ASSIGN - Use the assignment overload to make a copy. QUIT - Quit. ? - Print commands. UNIX> bin/ht_vector '-->' --> SET_BIN 10 # We're using a bin size of ten. --> ADD_VALUES 3 4 5 6 15 23 41 42 # Four values into 0-10, one each into 10-20 and 20-30, --> PRINT # plus two values into 40-50 0 4 10 1 20 1 40 2 --> CLEAR # This clears the histogram, but keeps the bin size at 10. --> ADD_VALUES 101 4.5 # One value into 0-10 and one into 100-110 --> PRINT 0 1 100 1 --> QUIT UNIX> --------------------------------------------------------------------- Implementing the class with a double and a vector There are many ways that we can implement this class. We will explore the different implementations in the Running Time Lecture Notes. In this lecture we focus on one implementation, which is in the file src /histogram_vector.cpp. In this implementation, we define a class called Histo_Vector: +------------------------+ | class Histo_Vector { | | public: | | vector Elts; | | double Bin_Size; | | }; | +------------------------+ This will implement the Histogram class as follows: When I call Add_Value(v), I will calculate the bin number from v, I will use that as an index to the vector Elts. If Elts isn't big enough, I will resize it (putting zeros in the new entries). I will then increment Elts[bin]. I can implement Get_Data() by traversing Elts, and calling push_back () on ids and num_elts whenever Elts[i] is greater than zero. That's straightforward, but we'll get to that code later. Instead, let's look at the constructor and destructor: +--------------------------------------------------------------------------------------------------------+ | /* The constructor allocates an instance of | /* The destructor calls the Histo_Vector | | Histo_Vector, and then sets the state member | destructor, which will clear out the Elts | | variable to that instance. */ | vector, and release the memory of the instance.*/ | | | | | Histogram::Histogram() | Histogram::~Histogram() | | { | { | | Histo_Vector *hv; | Histo_Vector *hv; | | | | | hv = new Histo_Vector; | hv = (Histo_Vector *) state; | | hv->Bin_Size = -1; | delete hv; | | | } | | state = (void *) hv; | | | } | | +--------------------------------------------------------------------------------------------------------+ When the constructor for Histogram is called, the only variable that has been allocated it state, and it is uninitialized. We need to create an instance of Histo_Vector, so we do that with new. And then we store it in state. Whenever we call a method of the Histogram class, the very first thing we'll do is declare a pointer to a Histo_Vector, and then set it equal to state. This is how the Histogram class only uses that one (void *), that the users know is there, but they don't know what it contains. In other words, we know what the state is, but the users don't. As you can see above, the destructor does just what I said -- it declares a variable hv, which is a pointer to a Histo_Vector, and it sets it equal to state. It then calls delete, which deallocates all of its memory. Below, I show the implementations of the three easy methods: Clear(), Set_Bin_Size() and Get_Bin_Size(): +--------------------------------------------------------------------------------------------------------------------------------------+ | /* Clear() gets rid of the data by | /* Set_Bin_Size() error checks its argument, | /* Get_Bin_Size simply returns the bin size | | clearint the Elts vector. */ | and also error checks that there no data | from the Histo_Vector. */ | | | (because it wouldn't make sense to change | | | | the bin size if we have already assigned | | | | data to bins). It then sets the bin size. */ | | | | | | | void Histogram::Clear() | bool Histogram::Set_Bin_Size(double bs) | double Histogram::Get_Bin_Size() const | | { | { | { | | Histo_Vector *hv; | Histo_Vector *hv; | Histo_Vector *hv; | | | | | | hv = (Histo_Vector *) state; | hv = (Histo_Vector *) state; | hv = (Histo_Vector *) state; | | hv->Elts.clear(); | | return hv->Bin_Size; | | } | if (bs <= 0) return false; | } | | | if (hv->Elts.size() != 0) return false; | | | | hv->Bin_Size = bs; | | | | return true; | | | | } | | +--------------------------------------------------------------------------------------------------------------------------------------+ The point of this code that I want to stress to you is how in each method, the very first thing that we do is declare a pointer to a Histo_Vector, and set it from state. This is highlighted in red in each method above. Once we have done this, we can work with the (Histo_Vector *). Add_Value() works as described above. It error checks, then it figures out the bin, then it makes sure that the vector has the bin defined, and it adds one to the bin's element in the vector: +-------------------------------------------------------------------------------------------------+ | bool Histogram::Add_Value(double d) | | { | | Histo_Vector *hv; | | int bin; | | | | hv = (Histo_Vector *) state; | | | | if (d < 0) return false; // Error check | | if (hv->Bin_Size < 0) return false; | | | | bin = (int) floor(d/hv->Bin_Size); // Calculate the bin and make sure there's room | | if (bin >= (int) hv->Elts.size()) { // for it in the Elts vector. | | hv->Elts.resize(bin+1, 0); | | } | | | | hv->Elts[bin]++; // Add one to the bin. | | return true; | | } | +-------------------------------------------------------------------------------------------------+ And Get_Data() simply creates the ids and num_elts vectors from Elts. If Elts[i] is equal to zero, then we ignore that bin: +------------------------------------------------------------------------------------------------------+ | bool Histogram::Get_Data(vector &bin_ids, vector &num_elts) const | | { | | Histo_Vector *hv; | | size_t i; | | | | hv = (Histo_Vector *) state; | | | | if (hv->Bin_Size < 0) return false; // If we haven't set the bin size yet, return an error.. | | | | bin_ids.clear(); // Clear out the vectors if they have any elements. | | num_elts.clear(); | | | | for (i = 0; i < hv->Elts.size(); i++) { // Push the bins onto id's and Elts[bin] onto num_elts. | | if (hv->Elts[i] != 0) { | | bin_ids.push_back(i); | | num_elts.push_back(hv->Elts[i]); | | } | | } | | return true; | | }; | +------------------------------------------------------------------------------------------------------+ Finally, the assignment overload simply copies the Histo_Vector from one Histogram to the other. The copy constructor sets up the state and then uses the assignment overload to copy the Histo_Vector: +------------------------------------------------------------------------------------------------------------+ | /* The copy constructor creates a new | /* The assignment overload allocates | | Histo_Vector, and then uses the | a new instance of Histo_Vector, and | | assignment overload to copy from h. */ | copies it from h's Histo_Vector. */ | | | | | Histogram::Histogram(const Histogram &h) | Histogram& Histogram::operator= (const Histogram &h) | | { | { | | Histo_Vector *hv; | Histo_Vector *hv_original, *hv_copy; | | | | | hv = new Histo_Vector; | hv_original = (Histo_Vector *) h.state; | | state = (void *) hv; | hv_copy = (Histo_Vector *) state; | | *this = h; | *hv_copy = *hv_original; // This copies Elts and Bin_Size. | | } | return *this; | | | } | +------------------------------------------------------------------------------------------------------------+ There's more to this lecture in the Running Time Lecture Notes, where we implement the Histogram class in five different ways and then compare them. All of them work like src/histogram_vector.cpp, creating their own classes, and then storing pointers to them in state.