Programming Project
by JAYAT Adrien and JACQUET Ysée (2023)
This project was realised by Adrien JAYAT and Ysée JACQUET.
It consists of a recommendation system for movies based on the Netflix Prize dataset published by Netflix in 2006.
Our algorithm achieves an RMSE of 0.971 on the test set. In comparison, the RMSE of Cinematch, Netflix's own algorithm, was 0.9525.
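For reference, the RMSE (root-mean-square error) quoted above is conventionally defined as follows. The symbols are notation introduced here for illustration: $N$ is the number of ratings in the test set, $r_i$ an actual rating and $\hat{r}_i$ the corresponding predicted rating.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \hat{r}_i - r_i \right)^2}
```

A lower value means the predictions are closer to the actual ratings, so 0.9525 is slightly better than 0.971.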
This project uses two submodules:
First, clone the project and its submodules:
Install make, gcc, doxygen and Zstandard:
Build the project with the make command. It will generate the main executable in the current working directory.
You can run all tests with the make tests command:
After building the project, run the ./main executable to start the program. Use the -h option to print the following list of options in your terminal.
| Flag | Argument | Description |
|---|---|---|
| -f | FORCE | Force all statistics to be recomputed. |
| -r | LIKES_FILE | List of movies liked by the user. |
| -n | N | Length of the recommendation list returned by the algorithm. |
| -d | DIRECTORY | Path of the folder where the result files will be saved. |
| -l | LIMIT | Ignore ratings with a date later than LIMIT. |
| -s | MOVIE_ID | Give statistics about the movie with the identifier MOVIE_ID. |
| -c | IDS | Take into account only the ratings of the customers with the given identifiers. |
| ∅ | NB_CUSTOMER_IDS | Number of customer identifiers given. |
| -b | IDS | Exclude the ratings of the customers with the given identifiers. |
| ∅ | NB_BAD_REVIEWERS | Number of bad reviewers given. |
| -e | MIN | Take into account only customers who rated at least MIN movies. |
| -t | TIME | Specify the execution time of the algorithm. |
| -p | PERCENT | Value between 0 and 1 that sets the importance of personalized recommendations relative to popular recommendations. |
Note that options -r, -n, -t and -p are not used for statistics processing.
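As an illustration of how a few of the options above could be read in C, here is a minimal sketch based on POSIX getopt. It is not the project's actual parser: the struct options type and the parse_options function are hypothetical names introduced for this example, and only -n, -p and -d are handled.

```c
#define _POSIX_C_SOURCE 200809L /* makes getopt visible under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical container for three of the documented options. */
struct options {
    int n;           /* -n: length of the recommendation list */
    double percent;  /* -p: importance of personalized recommendations */
    const char *dir; /* -d: folder where result files are saved */
};

/* Parses -n, -p and -d into opts. The caller fills opts with default
 * values first. Returns 0 on success, or -1 on an unknown flag, a missing
 * argument, a non-positive N or a PERCENT outside [0, 1]. */
int parse_options(int argc, char *const argv[], struct options *opts)
{
    int c;

    assert(opts != NULL);
    optind = 1; /* restart scanning so the function can be called again */
    while ((c = getopt(argc, argv, "n:p:d:")) != -1) {
        switch (c) {
        case 'n':
            opts->n = atoi(optarg);
            break;
        case 'p':
            opts->percent = strtod(optarg, NULL);
            break;
        case 'd':
            opts->dir = optarg;
            break;
        default: /* getopt has already printed a diagnostic */
            return -1;
        }
    }
    if (opts->n <= 0 || opts->percent < 0.0 || opts->percent > 1.0)
        return -1;
    return 0;
}
```

Validating the values after the loop keeps the switch short and rejects a PERCENT outside the documented range before any computation starts.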
The likes.txt file contains titles of movies liked by the user. You can also give a list of movie ids directly from the command line.
Add the -p option to set how much weight personalized recommendations carry relative to popular ones.
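One simple way to picture the effect of -p is a linear blend of two scores. The sketch below is an assumption made for illustration, not a description of the project's actual formula; blend_scores and its parameters are hypothetical names.

```c
#include <assert.h>

/* Hypothetical linear blend of two scores for one movie.
 * percent = 1 keeps only the personalized score,
 * percent = 0 keeps only the popularity score. */
double blend_scores(double personal, double popular, double percent)
{
    assert(percent >= 0.0 && percent <= 1.0);
    return percent * personal + (1.0 - percent) * popular;
}
```

With this blend, a value of 0.5 gives both sources the same influence on the final ranking.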
With the -s option and the movie identifier 42, the program creates a file named stats_mv_000042.csv in the stats folder, containing the minimum, maximum and average score of movie 42.
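The file name suggests that the movie identifier is zero-padded to six digits. The following sketch shows how such a name and the three statistics could be produced in C. The names stats_file_name, compute_stats and struct movie_stats are hypothetical, and integer ratings are an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical summary of the ratings of one movie. */
struct movie_stats {
    int min;
    int max;
    double average;
};

/* Writes a name such as "stats_mv_000042.csv" into buf.
 * Returns 0 on success, -1 if buf is too small. */
int stats_file_name(char *buf, size_t size, unsigned movie_id)
{
    int len = snprintf(buf, size, "stats_mv_%06u.csv", movie_id);
    return (len >= 0 && (size_t)len < size) ? 0 : -1;
}

/* Computes the minimum, maximum and average of n ratings (n > 0). */
struct movie_stats compute_stats(const int *ratings, size_t n)
{
    assert(ratings != NULL && n > 0);
    struct movie_stats s = { ratings[0], ratings[0], 0.0 };
    long sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (ratings[i] < s.min)
            s.min = ratings[i];
        if (ratings[i] > s.max)
            s.max = ratings[i];
        sum += ratings[i];
    }
    s.average = (double)sum / (double)n;
    return s;
}
```

Checking the return value of snprintf against the buffer size avoids silently using a truncated file name.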