project page
mailing list

The Database Schema: How the database is set up

Enzyme consists of a mySQL database at the core, with PHP3 (with JavaScript) for the information gathering side and for the search+display side. The data are stored orthogonally. Important tables to know about are 'auth_user' (for handling user logins, PHPLIB standard), 'user' (contains first and last name, some other stuff), and the grandaddy of them all, 'student'. The student table isn't very big or anything, but it's important because the studentid field is a unique identifier which is used in all the tables that contain data corresponding to a particular student.

Under our existing model of student resumes ("profiles"), there are 10 basic categories: Personal, Education, Objectives, Preferences, Skills, Experience, Activities, Test Scores, Honors, and Portfolio. Many of these categories have unique quirks and needs for their implementations. Right now these quirks are for the most part handled on the individual pages, with a bunch of messy code. However, a more or less typical category such as Activities works as follows. In the database, we have an 'activity' table and a 'student_activity' table. The activity table contains the information for the activity itself: name, type, etc. The student_activity table contains a studentid (a key into the student table), an activityid (corresponding to the actvity in the activity table), and the information for that activity specific to that student, ie, position, hours per week, start date, end date, etc. Sometimes the relations get more complicated, such as in the Education category. In this category, we have the 'school' table and the 'student_school' table. Plus there are also the 'degree' (only a few entries, one for BA, BS, MA, etc) and 'student_degree' tables, and the 'concentration' and 'student_concentration' tables. The 'concentration' table contains the names of all the majors, which can be shared across different students.

So you get the basic idea of how the relations work in the various categories. One way of looking at it is that you have basically three different types of tables.

1.) Some are "locked" tables, which are editable only by an admin, e.g., 'degree', 'activity_type', 'ethnicity'--things that are known in advance. These are useful for grouping records in "editable-shared" (see below) tables. E.g., for the 'company' table, we have the 'sector' (eg, software, finance, retail, etc) table; for the 'concentration' table, we have the 'concentration_type' (eg, social science, humanities, arts, etc) table. This grouping makes for useful searches.

2.) Then there are "editable-shared" tables, such as 'company', 'activity', 'school', and 'skill'. These are the trickiest to deal with. Anyone can add to them--for instance, if the person goes to a school we don't have in the database, she can add that school to the school table. When the next person from that school comes, he can now select that school, as added by the previous person, from a pop-up menu. (It's a little more complicated than that, but that's the basic idea.) Many site admins may feel uncomfortable with this arrangement, because you have to trust the user not to put inaccurate, incomplete, or bogus data into the database. However, we have found this approach to be effective even with the disadvantage. When it works, you save the users a lot of work, because you don't make them re-enter the same data over and over. Further, the advantage of this method is standardization: if you let each person enter herr data individually, one student may say "Pennsylvania State University" as her school, while another says "Penn State." If you have a menu, then you can ensure that both people are recorded as going to the same school. One way that we have used for combating the problem of overstuffing and/or bogus data in the tables is to require that more than one person have the item in his profile in order for it to be displayed in the menu. This is tricky of course, but we'll see how it's done later.

3.) Lastly, there are "editable-private" tables, such as 'student_skill' or 'student_activity', which are editable by the user, but each person has his or her own set of records, which only she can edit.

Hopefully soon we will be able to put a diagram of the database structure; a visual representation is sometimes clearest.

hits since May 16, 2001.