Scalability Needs for GWD

Global Web Developer (GWD), as an application tailored to facilitate web development for organizations of various sizes, can have installations that range from an organization developing a small community center website to one developing a large-scale corporate website composed of smaller departmental subsites. GWD has been designed with the scalability needs of these different types of organizations in mind.

For organizations with a small website, GWD can run adequately from a single host. In many cases, scaling a single host may be as simple as upgrading to a better server with multiple processors and more RAM. For larger-scale website development, however, GWD can be scaled and optimized by partitioning it across multiple hosts. GWD has been designed to enable this kind of partitioning. Taxed processes can be restored to desired performance levels by adding hosts to the application server group, effectively scaling to match increased use.

Modularization for partitioning

Many of GWD's functions have been modularized in such a way that different features can be partitioned onto different hosts. These functions have been split into 12 different atomic processes, listed in the table below. Although each module can potentially be installed on a separate host, in practice scalability is attained more appropriately by clustering similar atomic functions on the same host. As necessary, the more resource-intensive processes, such as the collaborative environment, can have entire hosts devoted to their processing.

The table lists each feature and the database associated with that module. The "Group" letters on the right of the table are clustering suggestions for the different modules. These are only suggestions, based on how close in function the different processes are. Some of these groups or individual processes may require further partitioning onto multiple hosts, depending on their degree of use (please refer to the section on congestion). The twelve basic modules are as follows:
| Feature | Database | Group |
| --- | --- | --- |
| Access Control/Permission | GWD User | A |
| Styleguide & Template Management; Quality Control | GWD Website; Template | A |
| User Email Alert; Access Control/Permission | Visitor | B |
| (Separate location for public pages) | Public Website | B |
| Annotation | Annotation | C |
| Real Time Collaboration | Real Time | C |
| Replication/Reconciliation | Reconciliation | D |
| Version Control | Version Control | D |
| Labor Tracking | Site Status | E |
| Link Management | Link Management | E |
| Traffic-Based Link Suggestion; Content-Based Link Suggestion | Suggestion | E |
| Backup | Backup | F |
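
The clustering suggestions in the table can be written down as a small deployment map. The following is a minimal sketch in Python and is not part of GWD itself: the feature, database, and group names are copied from the table, while the host naming scheme and the `plan_hosts` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# (feature, database, group) rows copied from the table above.
MODULES = [
    ("Access Control/Permission", "GWD User", "A"),
    ("Styleguide & Template Management; Quality Control", "GWD Website; Template", "A"),
    ("User Email Alert; Access Control/Permission", "Visitor", "B"),
    ("(Separate location for public pages)", "Public Website", "B"),
    ("Annotation", "Annotation", "C"),
    ("Real Time Collaboration", "Real Time", "C"),
    ("Replication/Reconciliation", "Reconciliation", "D"),
    ("Version Control", "Version Control", "D"),
    ("Labor Tracking", "Site Status", "E"),
    ("Link Management", "Link Management", "E"),
    ("Traffic-Based Link Suggestion; Content-Based Link Suggestion", "Suggestion", "E"),
    ("Backup", "Backup", "F"),
]


def plan_hosts(modules, dedicated=()):
    """Map each (hypothetical) host name to the features it runs.

    Modules are clustered by their suggested group; any feature named in
    `dedicated` is pulled out of its cluster onto a host of its own.
    """
    plan = defaultdict(list)
    for feature, _database, group in modules:
        if feature in dedicated:
            host = "host-" + feature.lower().replace(" ", "-")
        else:
            host = "host-group-" + group.lower()
        plan[host].append(feature)
    return dict(plan)


# One host per suggested group.
clustered = plan_hosts(MODULES)

# A collaboration-heavy organization gives real-time collaboration its own host.
collab_heavy = plan_hosts(MODULES, dedicated={"Real Time Collaboration"})
```

In the second plan, Group C is split so that annotation stays on the group host while real-time collaboration moves to a dedicated one, which mirrors the case where a resource-intensive process has an entire host devoted to it.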
Characteristics for each group:
Partitioning in terms of scalability

As mentioned above, there are individual modules based around specific functions and these modules are grouped together into clusters.

Diagram: GWD Application Logic Architecture

(To get a more detailed view of where this GWD application logic diagram resides, please refer to the section about "Host-level Architecture for GWD".)

There are three reasons behind the modularization of features with feature-specific databases and their groupings:
Even though some transactions may require processes on multiple hosts, such as those that start in the Group A cluster and are redirected to the other processes, most usage is directed toward a single module. Taking the "real-time collaboration" process as an example, organizations that require intensely interactive web development will make heavy use of the collaboration features. In these organizations, the Group C cluster may have to be broken up so that one or several full-time servers are devoted to the "real-time collaboration" module. This would improve the performance of both the collaboration and the annotation modules.

Individual modules can themselves be partitioned across multiple hosts. One method, partitioning the "real-time collaboration" module across multiple hosts, is outlined in the section on concurrency. That example highlights the value of directing each collaboration request to one of many hosts so that the load is balanced and evenly distributed.

Partitioning of modules is not limited to "real-time collaboration." For an organization that focuses on publishing web-based books, heavy use might be centered more on the "replication/reconciliation" feature. Such an organization may need to partition this module across multiple hosts while keeping the other clusters (A, B, C, D's "version control" module, and E) on a single host.

Along the same lines, scalability applies not only to the processes but also to the databases used by the modules. Because GWD will be heavily used in some organizations, these databases may accumulate a substantial amount of data, and that volume can slow down the performance of tasks. These databases can therefore also be partitioned by website subsection or another set identifier, so that each incoming request to a database is directed to the proper partition.
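
The two routing ideas in this section, sending each collaboration request to one of several hosts and directing each database request to the partition for its website subsection, can be sketched as follows. This is an illustration under stated assumptions only: the host names, the partition names, the least-loaded policy, and the hash-based mapping are all hypothetical choices, not GWD's documented mechanism (the section on concurrency describes the collaboration method itself).

```python
import hashlib


class CollaborationBalancer:
    """Send each new collaboration session to the least-loaded host.

    Host names are hypothetical; load is counted as active sessions.
    """

    def __init__(self, hosts):
        self.active = {host: 0 for host in hosts}

    def assign(self):
        # Pick the host with the fewest active sessions (ties go to the
        # host listed first).
        host = min(self.active, key=self.active.get)
        self.active[host] += 1
        return host

    def release(self, host):
        # Call when a session ends so the host can take new work.
        self.active[host] -= 1


def partition_for(subsection, partitions):
    """Pick a database partition from a stable hash of the subsection name.

    A stable hash (rather than Python's per-process hash()) keeps the same
    subsection on the same partition across restarts and across hosts.
    """
    digest = hashlib.sha256(subsection.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(partitions)
    return partitions[index]


# Six collaboration sessions spread over three assumed hosts.
balancer = CollaborationBalancer(["collab-1", "collab-2", "collab-3"])
assigned = [balancer.assign() for _ in range(6)]

# Requests for one subsection always go to the same assumed partition.
partitions = ["annotation-db-0", "annotation-db-1"]
target = partition_for("marketing", partitions)
```

Hashing is only one way to map a set identifier to a partition. An explicit lookup table from subsection to partition would be a reasonable alternative when subsections differ greatly in size, because it lets an administrator place large subsections by hand.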
What questions must scalability address?

The main benefit of having different methods of partitioning is the ability to customize GWD installations for specific organizations. For each organization, the use of GWD, the scalability needs, and the monetary resources will differ. A few questions must be answered to inform the scaling of GWD:
The purpose of scalability is to optimize performance. Adding mobile code as a means of distributing application logic to the client side may, in some cases, be another scalability option that increases performance without adding hardware. A diagram with mobile code as part of GWD can be seen in the architecture section.