Антон Шевчук (Anton Shevchuk) // Web developer

PHP CookBook: MyPHPTube.com (YouTube clone) // Estimation

MyPHPTube

With the growing popularity of YouTube.com, many people want to build a similar service but don’t know how. If you are one of them, this article is for you.

Features

First of all, let’s describe the basic functions of the site (and define the user roles):

  1. Guest
    • View video online
  2. Member
    • Upload videos
  3. Administrator
    • Manage users
    • Manage files

WEB 2.0 Features

We need to extend the basic functions in order to attract more traffic:

  1. Guest
    • view comments on video
    • search for files (using categories or tags)
  2. Member
    • comment on videos
    • record video from Web (using webcam)
    • assign category to video
    • assign tag to video
    • rate video
    • organize list of friends
    • manage favorites
    • send in-mails
  3. Administrator
    • Manage comments
    • Manage categories

These are certainly not all the features of YouTube, but they are enough to start with.

Server Configuration

Our system will be based on LAMP:

  • Linux
  • Apache (version 2.2 and higher)
  • MySQL (version 5.0 and higher)
  • PHP (version 5.2 and higher)

Troubleshooting

Now let’s describe the problems that you may encounter.

Video Conversion

To attract users, we need to let them upload videos in any format, but it is more comfortable for viewers to get videos in one specific format. This way they do not need a zoo of codecs on their machines; on the other hand, if a site requires extra actions from the user, the user will very likely forget about it (we are all lazy, you know). What can be done about this? The answer is obvious: convert all video files into a single format, and that format should be FLV. Flash video can be viewed on most operating systems because it relies on the widespread Adobe Flash Player, which is preinstalled in most browsers, and the format is also supported by many video playback programs, such as MS MPlayer and almost all other players.

We need the following software tools (all opensource):

  1. mencoder or FFmpeg
  2. flvtool2 (requires Ruby)
  3. PHP Program Package (‘example.php’ contains an example using mencoder) or the ffmpeg class (‘ffmpeg.example1.php’ uses ffmpeg)

You can use mplayer to get additional information about the original video file.

If everything is built and installed properly, we can now convert…
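The conversion step can be sketched roughly as follows. This is only an illustration in Python (chosen to keep the sketch short; the same command lines can be run from PHP), and it assumes that FFmpeg and flvtool2 are installed and available on the PATH. The function names and the file paths are hypothetical. The options used are standard ones: `-i` names the input, `-ar` sets the audio sample rate, `-f flv` forces the FLV container, `-y` overwrites an existing target, and `flvtool2 -U` updates the metadata of the resulting file.

```python
import subprocess
from pathlib import Path


def build_convert_commands(source, target_dir):
    """Return two command lines: the FFmpeg conversion, then the flvtool2 metadata update."""
    target = Path(target_dir) / (Path(source).stem + ".flv")
    convert = [
        "ffmpeg",
        "-y",            # overwrite the target if it already exists
        "-i", str(source),
        "-ar", "44100",  # MP3 audio inside FLV allows only 44100, 22050 or 11025 Hz
        "-f", "flv",     # force the FLV output container
        str(target),
    ]
    # flvtool2 -U writes duration and keyframe metadata into the FLV file
    add_metadata = ["flvtool2", "-U", str(target)]
    return [convert, add_metadata]


def convert_to_flv(source, target_dir):
    """Run both steps in order; raises CalledProcessError if either tool fails."""
    for command in build_convert_commands(source, target_dir):
        subprocess.run(command, check=True)
```

Building the command lines separately from running them makes it easy to log or test them without touching any real video file.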

Upload

People do not like to sit and wait while a file is silently uploaded to the server. Users are curious, so we have to show a progress bar.

All references are from the xajax forum: http://community.xajaxproject.org/viewtopic.php?pid=10100
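Whatever mechanism reports the upload state, the bar itself comes down to one calculation. A minimal sketch in Python, assuming the server can tell us how many bytes it has received so far and how many it expects in total (both parameter names and the function name are hypothetical):

```python
def upload_progress(bytes_received, bytes_total, width=20):
    """Return (percent, text bar) for an upload in progress."""
    if bytes_total <= 0:
        # total size unknown: show an empty bar
        return 0, "[" + " " * width + "]"
    # clamp so that a bad counter never overflows the bar
    received = min(max(bytes_received, 0), bytes_total)
    percent = received * 100 // bytes_total
    filled = received * width // bytes_total
    return percent, "[" + "#" * filled + " " * (width - filled) + "]"


print(upload_progress(5, 10, width=10))  # prints (50, '[#####     ]')
```

In a real page the percentage would be polled by the browser and drawn as a graphical bar; the text bar here only stands in for that.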

Resources

If we use the same server for the website and for video conversion, the website will be more dead than alive. We therefore need at least one other server that takes the original files from the web server and, after a while, puts the converted files back. In other words, we need more than one server to build a working solution.

Load distribution

If we store all data on one server, it will not cope with the load. We need several servers to store the data.

You must decide for yourself how to distribute the load. Estimate the amount of data to be stored and then choose an appropriate scheme. Let’s assume that one user uploads 20 MB per month (about 234 GB per year for 1000 users) and that unpopular files are stored for no more than a year:
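The figure of 234 GB can be checked with a few lines of arithmetic. The sketch below is in Python and assumes 1 GB = 1024 MB, which is the conversion that reproduces the number in the text; the function name is hypothetical.

```python
def yearly_storage_gb(users, mb_per_user_per_month=20, months=12):
    """Estimate the data uploaded per year, in GB (1 GB = 1024 MB)."""
    total_mb = users * mb_per_user_per_month * months
    return total_mb / 1024


print(yearly_storage_gb(1000))  # prints 234.375
```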

  • Stored data ~ 0.3 TB – 1.5 TB:
    We keep all the videos on every mirror. There is a main mirror, the server on which a converted video always appears first, and all other mirrors sync with it
  • Stored data ~ 1.5 TB – 3 TB:
    All videos are stored on one central server; if a video grows in popularity, it is copied to the other mirrors
  • Stored data > 3 TB:
    A video is uploaded to the nearest mirror (for example, we assume that a video uploaded from China will be popular mostly in China, so we place it on the mirror in China). As the video grows in popularity, we copy it to the server closest to the epicentre of that popularity (for example, a Chinese user living in the United States uploads a video to a mirror located in the United States, but the video is watched mostly in China; in this situation the file is copied to the Chinese server)
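The three cases above amount to a simple threshold check. A sketch in Python: the function name and the returned labels are hypothetical, and treating anything up to 1.5 TB (including amounts below 0.3 TB) as the first case is an assumption, since the text does not cover smaller sites or say which scheme applies exactly at a boundary.

```python
def choose_storage_scheme(stored_tb):
    """Map the estimated amount of stored data (in TB) to one of the three schemes."""
    if stored_tb <= 1.5:
        return "every mirror keeps all videos"
    if stored_tb <= 3:
        return "central server, popular videos copied to mirrors"
    return "upload to nearest mirror, copy towards the audience"


# 20 MB x 12 months x 1000 users, converted to TB: about 0.23 TB per year
yearly_tb = 20 * 12 * 1000 / 1024 / 1024
print(choose_storage_scheme(yearly_tb))  # prints "every mirror keeps all videos"
```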

These are not statistical figures; they are for illustrative purposes only. China is cited merely as an example (nothing personal). Share your opinion in the comments…

Database

Next, I will describe a simple database architecture:

users
  id             autoincrement field
  login          unique login
  password       encrypted password
  email          user e-mail
  actcode        activation code
  role           ENUM(guest/user/admin)
  status         not active / active / disabled
  date_create    date of creation
  date_update    date of last profile change
  date_login     date of last login
  other fields   e.g. first name, last name

friends
  id             autoincrement field
  user_id1       user ID
  user_id2       user ID
  status         request / ok / cancel
  date_create    date the request was sent
  date_update    date the request was accepted or denied

files
  id             autoincrement field
  title          title of video file
  file           name of file on file system
  status         not convert / in process / ok
  access         public / members only / friends only / private
  author_id      ID of owner (users)
  category_id    ID of category (categories)
  date_create    date of creation
  date_update    date of last change
  other fields   e.g. length, description

mirrors
  id             autoincrement field
  url            mirror url
  date_create    date of creation
  date_update    date of last change

mirrors_link
  file_id        ID of file (files)
  mirror_id      ID of mirror (mirrors)
  status         current file status: downloading / ok
  date_create    date of creation
  date_update    date of last change

categories
  id             autoincrement field
  pid            parent category ID
  name           name of category
  other fields   e.g. metadescription, metakeywords

tags
  id             autoincrement field
  word           tag word

tags_link
  id             autoincrement field
  tag_id         tag ID (tags)
  file_id        file ID (files)

comments
  id             autoincrement field
  author_id      ID of owner (users)
  file_id        file ID (files)
  message        text of message
  date_create    date of creation

rate
  id             autoincrement field
  author_id      ID of owner (users)
  file_id        file ID (files)
  rate           integer value, e.g. from 0 to 10
  date_create    date of creation

messages
  id             autoincrement field
  author_id      ID of owner (users)
  user_id        ID of recipient (users)
  type           e.g. friend request-response / admin message
  author_folder  outbox/draft/delete
  user_folder    inbox/delete
  user_status    read or not
  date_create    date of creation

bookmarks
  id             autoincrement field
  author_id      ID of owner (users)
  file_id        file ID (files)
  title          title of link
  description    some description
  date_create    date of creation
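To show how these tables work together, here is a small sketch of the tag search, written in Python with an in-memory SQLite database standing in for MySQL. Only three of the tables are created, with a reduced set of columns; the column types, the UNIQUE constraint on the tag word, and the function name are assumptions made for the illustration, not part of the design above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE files (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,
    title  TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'not convert'
);
CREATE TABLE tags (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    word TEXT NOT NULL UNIQUE
);
CREATE TABLE tags_link (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    tag_id  INTEGER NOT NULL REFERENCES tags(id),
    file_id INTEGER NOT NULL REFERENCES files(id)
);
"""


def files_by_tag(db, word):
    """Return titles of converted files (status 'ok') that carry the given tag."""
    rows = db.execute(
        """
        SELECT files.title
        FROM files
        JOIN tags_link ON tags_link.file_id = files.id
        JOIN tags ON tags.id = tags_link.tag_id
        WHERE tags.word = ? AND files.status = 'ok'
        ORDER BY files.id
        """,
        (word,),
    )
    return [title for (title,) in rows]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO files (title, status) VALUES ('Cat video', 'ok')")
db.execute("INSERT INTO files (title) VALUES ('Still converting')")
db.execute("INSERT INTO tags (word) VALUES ('cats')")
db.execute("INSERT INTO tags_link (tag_id, file_id) VALUES (1, 1), (1, 2)")
print(files_by_tag(db, "cats"))  # prints ['Cat video']
```

Filtering on the status column keeps files that are still waiting for conversion out of the search results.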

A small note:

  • For the date_create and date_update columns, use gmdate('Y-m-d H:i:s'), i.e. GMT; this will make life easier later when displaying times on the site
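The same idea, sketched in Python for illustration: the timestamp is always converted to GMT/UTC before it is formatted, so the stored value does not depend on the server's or the user's time zone. The function name is hypothetical, and the example date is made up.

```python
from datetime import datetime, timedelta, timezone


def gmt_timestamp(moment=None):
    """Format a moment as 'Y-m-d H:i:s' in GMT; defaults to the current time."""
    moment = moment or datetime.now(timezone.utc)
    # an aware datetime is converted to UTC; a naive one is treated as local time
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# 03:04:05 in a UTC+2 zone is stored as 01:04:05 GMT
local_time = datetime(2008, 1, 2, 3, 4, 5, tzinfo=timezone(timedelta(hours=2)))
print(gmt_timestamp(local_time))  # prints "2008-01-02 01:04:05"
```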

Team

What is the best team for developing such a project? I propose the following:

  • 2 PHP developers
  • Flash Developer / Designer
  • 1 UNIX-administrator
  • 1 Tester
  • 1 Manager

Estimation

Guest
  Static pages                  1h/page  e.g. “Contact Us”, “Terms of Use” etc.
  Search file                   6h       simple search by several params
  Tags Cloud                    8h
  View video                    16h      FLV video player
  View comments                 6h
  Registration                  12h      Registration and activation via e-mail
  Forgot password               2h

User
  Login/Logout                  2h
  Upload file                   14h
  Record video                  16h      Required media server: FMS, Wowza (from Feb 2007) or Red5 (opensource)
  Progress bar                  16h
  Send Comment                  4h
  Rate File                     2h
  Bookmarks management          8h       create/edit/delete

Admin
  Users management              16h      The names and details of registered members are listed, showing the date
                                         their account was created and other user info. Options available for the
                                         Administrator: view Member details, ban Members, search for existing
                                         users by First or Last name or Username etc.
  Categories management         16h

Others
  Design                        32h
  Database Design               16h
  Project Architecture Design   32h
  Organize File Storage         8h       From 8h to 96h
  Convert process               20h

Total
  Environment setup             40h      configure web-server, convert-server, mirrors
  Development                   256h
  Testing                       85h      30%-50% of all development
  Management                    40h      10% of all time
  Total:                        421h

So we have 421 hours, or approximately 2.5 months of development. Did you expect such a number? I thought it would be more :)
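The totals can be reproduced with a short calculation. The sketch below is in Python; the four figures come from the Total rows of the estimate, while the 168 working hours per month (21 days of 8 hours) is an assumption used only to turn hours into months.

```python
HOURS = {
    "environment setup": 40,
    "development": 256,
    "testing": 85,
    "management": 40,
}
HOURS_PER_MONTH = 168  # assumption: 21 working days of 8 hours each

total_hours = sum(HOURS.values())
person_months = total_hours / HOURS_PER_MONTH
print(total_hours, round(person_months, 1))  # prints "421 2.5"
```

Note that this is effort for one person; how it maps to calendar time depends on how the work is split across the team.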

This is a very optimistic estimate that rests on several assumptions:

  • the developers will use a CMF similar to phpXCore or Zend Framework
  • the developers are familiar with the chosen CMF (i.e. they won’t be learning it while coding)
  • the simplest file storage scheme will be used (all mirrors store all files)

If development is done from scratch, you can easily multiply this estimate by 2.
In total, the project will cost at least $10,000…

P.S.

The main problem is not the system; the main problem is attracting an audience. Who will use your service when there is YouTube (and even PornoTube)? If you have ideas, write them in the comments…

At the time of writing, the MyPHPTube.com domain was not registered. If you register it, send a beer to my home address… ;)