MySQL Redo Log: Recovering from Accidental Corruption

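This was really an accident, and the problem is not very complicated. Here is what happened: on Mac OS X I opened an InnoDB log file in vim and switched it to a hexadecimal view. The intent was only to display the file as hex, but vim ended up rewriting the file itself as hex text. If the database is then stopped, for example by being killed, it can no longer start normally. Let's replay the problem I ran into today: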

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.06 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| test           |
+----------------+
1 row in set (0.00 sec)

mysql> desc test;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| id    | int(11)      | YES  |     | NULL    |       |
| name  | varchar(200) | YES  |     | NULL    |       |
+-------+--------------+------+-----+---------+-------+
2 rows in set (0.01 sec)

mysql> insert into test values (12,'jjjj');
Query OK, 1 row affected (0.01 sec)
mysql> select * from test;
+------+--------+
| id   | name   |
+------+--------+
|    1 | jjjjjj |
|   12 | jjjj   |
+------+--------+
2 rows in set (0.00 sec)
mysql> exit;

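Now open the redo file with vim:

[Figure: redo12]

After pressing Enter, the contents are shown in hexadecimal:

[Figure: redo18]

The display itself is fine; the problem is that I then saved the file as it was. After that, the database can no longer start: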

2016-10-18T09:37:51.272392Z 1 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2016-10-18T09:37:51.292530Z 1 [Note] InnoDB: Completed initialization of buffer pool
2016-10-18T09:37:51.388325Z 1 [ERROR] InnoDB: Log file ./ib_logfile0 size 50336847 is not a multiple of innodb_page_size
2016-10-18T09:37:51.388380Z 1 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2016-10-18T09:37:51.701317Z 1 [ERROR] Failed to initialize DD Storage Engine
2016-10-18T09:37:51.701439Z 0 [ERROR] Data Dictionary initialization failed.
2016-10-18T09:37:51.701458Z 0 [ERROR] Aborting
2016-10-18T09:37:51.701478Z 0 [Note] Binlog end
2016-10-18T09:37:51.701506Z 0 [Note] Shutting down plugin 'MyISAM'
2016-10-18T09:37:51.701525Z 0 [Note] Shutting down plugin 'InnoDB'
2016-10-18T09:37:51.701533Z 0 [Note] Shutting down plugin 'CSV'
2016-10-18T09:37:51.701857Z 0 [Note] /Volumes/ssd/hadoop/soft/mysql-8/mysql8/bin/mysqld: Shutdown complete
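
One way to avoid this kind of accident is to never open the log file in an editor at all, and to produce the hex view from a read-only file handle instead. The following is a minimal Python sketch under that assumption; `hexdump` and `dump_file` are hypothetical helper names, and the output merely imitates the xxd-style layout of the dump shown later in this article.

```python
WIDTH = 16  # bytes per output line, as in the dump shown later


def hexdump(data: bytes) -> str:
    """Render bytes as xxd-style text without touching any file."""
    lines = []
    for offset in range(0, len(data), WIDTH):
        chunk = data[offset:offset + WIDTH]
        # Group the bytes two at a time: "0000 0002 ..."
        pairs = [chunk[i:i + 2].hex() for i in range(0, len(chunk), 2)]
        hex_part = " ".join(pairs)
        # Printable ASCII is shown as-is, everything else as a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:07x}: {hex_part:<39}  {text}")
    return "\n".join(lines)


def dump_file(path: str, length: int = 112) -> str:
    """Read the first `length` bytes in read-only binary mode and dump them."""
    with open(path, "rb") as handle:  # "rb": the file is never written
        return hexdump(handle.read(length))


if __name__ == "__main__":
    sample = bytes.fromhex("00000002000000000000000000002200")
    print(hexdump(sample))
```

Calling `dump_file("ib_logfile0")` from the data directory would print the start of the log file while leaving the file itself untouched.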

What about converting it back? That does not work either:

[Figure: redo-19]

Now look at the file sizes:

[Figure: redo20]
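
The startup error earlier complains that the log file size is not a multiple of innodb_page_size, and that check is easy to reproduce by hand. Below is a minimal sketch, assuming the default page size of 16384 bytes (the page size of this instance is not shown in the article). It compares the size reported in the error log with the configured innodb_log_file_size of 50331648; `is_valid_log_size` is a hypothetical helper name.

```python
INNODB_PAGE_SIZE = 16384  # assumption: default page size, not taken from the article


def is_valid_log_size(size_bytes: int, page_size: int = INNODB_PAGE_SIZE) -> bool:
    """True when the file size is a positive whole number of pages."""
    return size_bytes > 0 and size_bytes % page_size == 0


if __name__ == "__main__":
    # 50331648 is the configured log file size, 50336847 the size in the error log.
    for size in (50331648, 50336847):
        print(size, size % INNODB_PAGE_SIZE, is_valid_log_size(size))
```

Under that assumption the original size divides evenly, while the damaged file is 5199 bytes larger and therefore fails the check.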

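The two files differ in size, so even after converting back, the file is no longer identical to the original, and starting the database still reports the same error. So what now? It is actually not complicated: just delete the original log files and start the database: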

yc:data yc$ ../mysqlstart start
Starting MySQL
.... SUCCESS! 

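It starts successfully. To expand on this: when the redo log is deleted and the server is started, the database automatically recreates the redo files. But here is the question: data had been written and the database was then killed, so startup should need to run recovery, which would use ib_logfile0. That log file has just been recreated, so in theory recovery is impossible and the database should not be able to open. Yet it did open, so where is the catch? The command I ran was a plain kill xxxxx, and in that case MySQL does the following: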

2016-10-19T01:24:38.326833Z 0 [Note] Giving 0 client threads a chance to die gracefully
2016-10-19T01:24:38.326869Z 0 [Note] Shutting down slave threads
2016-10-19T01:24:38.326889Z 0 [Note] Forcefully disconnecting 0 remaining clients
2016-10-19T01:24:38.326903Z 0 [Note] Event Scheduler: Purging the queue. 0 events
2016-10-19T01:24:38.327021Z 0 [Note] InnoDB: FTS optimize thread exiting.
2016-10-19T01:24:38.436183Z 0 [Note] Binlog end
2016-10-19T01:24:38.437720Z 0 [Note] Shutting down plugin 'ngram'
2016-10-19T01:24:38.437751Z 0 [Note] Shutting down plugin 'BLACKHOLE'
2016-10-19T01:24:38.437758Z 0 [Note] Shutting down plugin 'ARCHIVE'
2016-10-19T01:24:38.437763Z 0 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2016-10-19T01:24:38.437793Z 0 [Note] Shutting down plugin 'MRG_MYISAM'
2016-10-19T01:24:38.437804Z 0 [Note] Shutting down plugin 'MyISAM'
2016-10-19T01:24:38.437827Z 0 [Note] Shutting down plugin 'INNODB_CACHED_INDEXES'
2016-10-19T01:24:38.437832Z 0 [Note] Shutting down plugin 'INNODB_SYS_VIRTUAL'
2016-10-19T01:24:38.437837Z 0 [Note] Shutting down plugin 'INNODB_SYS_DATAFILES'
2016-10-19T01:24:38.437840Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESPACES'
2016-10-19T01:24:38.437844Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN_COLS'
2016-10-19T01:24:38.437849Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN'
2016-10-19T01:24:38.437852Z 0 [Note] Shutting down plugin 'INNODB_SYS_FIELDS'
2016-10-19T01:24:38.437856Z 0 [Note] Shutting down plugin 'INNODB_SYS_COLUMNS'
2016-10-19T01:24:38.437860Z 0 [Note] Shutting down plugin 'INNODB_SYS_INDEXES'
2016-10-19T01:24:38.437864Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESTATS'
2016-10-19T01:24:38.437868Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLES'
2016-10-19T01:24:38.437872Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_TABLE'
2016-10-19T01:24:38.437875Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_CACHE'
2016-10-19T01:24:38.437879Z 0 [Note] Shutting down plugin 'INNODB_FT_CONFIG'
2016-10-19T01:24:38.437883Z 0 [Note] Shutting down plugin 'INNODB_FT_BEING_DELETED'
2016-10-19T01:24:38.437887Z 0 [Note] Shutting down plugin 'INNODB_FT_DELETED'
2016-10-19T01:24:38.437891Z 0 [Note] Shutting down plugin 'INNODB_FT_DEFAULT_STOPWORD'
2016-10-19T01:24:38.437895Z 0 [Note] Shutting down plugin 'INNODB_METRICS'
2016-10-19T01:24:38.437898Z 0 [Note] Shutting down plugin 'INNODB_TEMP_TABLE_INFO'
2016-10-19T01:24:38.437902Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_POOL_STATS'
2016-10-19T01:24:38.437906Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE_LRU'
2016-10-19T01:24:38.437910Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE'
2016-10-19T01:24:38.437914Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX_RESET'
2016-10-19T01:24:38.437918Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX'
2016-10-19T01:24:38.437922Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM_RESET'
2016-10-19T01:24:38.437926Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM'
2016-10-19T01:24:38.437930Z 0 [Note] Shutting down plugin 'INNODB_CMP_RESET'
2016-10-19T01:24:38.437934Z 0 [Note] Shutting down plugin 'INNODB_CMP'
2016-10-19T01:24:38.437938Z 0 [Note] Shutting down plugin 'INNODB_LOCK_WAITS'
2016-10-19T01:24:38.437942Z 0 [Note] Shutting down plugin 'INNODB_LOCKS'
2016-10-19T01:24:38.437945Z 0 [Note] Shutting down plugin 'INNODB_TRX'
2016-10-19T01:24:38.437949Z 0 [Note] Shutting down plugin 'InnoDB'

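A plain kill sends SIGTERM, so mysqld shuts down cleanly and no recovery takes place at the next startup. What if it is a kill -9 instead?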

2016-10-19T01:33:37.038483Z 0 [Warning] option 'innodb-buffer-pool-dump-pct': unsigned value 0 adjusted to 1
2016-10-19T01:33:37.038987Z 0 [Note] Plugin 'FEDERATED' is disabled.
2016-10-19T01:33:37.039720Z 1 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-10-19T01:33:37.039732Z 1 [Note] InnoDB: Uses event mutexes
2016-10-19T01:33:37.039737Z 1 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2016-10-19T01:33:37.039741Z 1 [Note] InnoDB: Compressed tables use zlib 1.2.3
2016-10-19T01:33:37.039980Z 1 [Note] InnoDB: Number of pools: 1
2016-10-19T01:33:37.040088Z 1 [Note] InnoDB: Using CPU crc32 instructions
2016-10-19T01:33:37.041305Z 1 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2016-10-19T01:33:37.051868Z 1 [Note] InnoDB: Completed initialization of buffer pool
2016-10-19T01:33:37.071735Z 1 [Note] InnoDB: Log scan progressed past the checkpoint lsn 11088465
2016-10-19T01:33:37.071769Z 1 [Note] InnoDB: Doing recovery: scanned up to log sequence number 11088474
2016-10-19T01:33:37.071999Z 1 [Note] InnoDB: Doing recovery: scanned up to log sequence number 11088474
2016-10-19T01:33:37.072017Z 1 [Note] InnoDB: Database was not shutdown normally!
2016-10-19T01:33:37.072025Z 1 [Note] InnoDB: Starting crash recovery.
2016-10-19T01:33:37.181203Z 1 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2016-10-19T01:33:37.181233Z 1 [Note] InnoDB: Creating shared tablespace for temporary tables
2016-10-19T01:33:37.181999Z 1 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
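
Whether a given startup went through crash recovery can be read straight from the error log. The sketch below is a rough illustration that only matches the message texts quoted in this article; other versions may word them differently, and `classify_startup` is a hypothetical helper name.

```python
def classify_startup(log_text: str) -> str:
    """Classify a mysqld startup from its error log text.

    Only the messages quoted in this article are recognised.
    """
    if "is not a multiple of innodb_page_size" in log_text:
        return "redo log rejected"
    if "Starting crash recovery" in log_text:
        return "crash recovery"
    return "no recovery reported"


if __name__ == "__main__":
    after_kill_9 = (
        "[Note] InnoDB: Database was not shutdown normally!\n"
        "[Note] InnoDB: Starting crash recovery.\n"
    )
    print(classify_startup(after_kill_9))
```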

This time there is a recovery phase. And what if the redo log is damaged again at this point?

2016-10-19T01:59:50.861261Z 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html for information about forcing recovery.
2016-10-19T01:59:50.861533Z 0 [ERROR] InnoDB: Page [page id: space=30, page number=5] log sequence number 11137168 is in the future! Current system log sequence number 11081880.

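Even with these errors reported, the database can still start. I won't go further into the recovery process here; it will be summarized separately.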

MySQL Redo Log Internals

Environment used in this article: Mac OS X 10.10.5 (14F1808);

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.0-dmr |
+-----------+
1 row in set (0.01 sec)

First, look at the default redo log settings in this version:

mysql> show variables like '%innodb_log%';
+-----------------------------+----------+
| Variable_name               | Value    |
+-----------------------------+----------+
| innodb_log_buffer_size      | 16777216 |
| innodb_log_checksums        | ON       |
| innodb_log_compressed_pages | ON       |
| innodb_log_file_size        | 50331648 |
| innodb_log_files_in_group   | 2        |
| innodb_log_group_home_dir   | ./       |
| innodb_log_write_ahead_size | 8192     |
+-----------------------------+----------+
7 rows in set (0.00 sec)

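By default InnoDB has 2 redo log files of 48 MB each, located in the data directory; both the size and the location can be changed. Whenever data in the database changes, the change is written to the redo log, which is how the D (durability) in ACID is implemented. Roughly, redo log is generated along this path:

 user session -> transaction -> transaction cache -> log buffer -> redo log

The files can be seen with an OS command: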

yc:data yc$ ls -al ib_log*
-rwxr-xr-x@ 1 yc  staff  50331648 Oct 17 10:39 ib_logfile0
-rwxr-xr-x@ 1 yc  staff  50331648 Sep 19 13:46 ib_logfile1

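The redo log files are not all written at the same time, so how are log records stored in them?

[Figure: redo1]

The redo log is used in a circular fashion, as the figure above shows: when one log file is full, the next one is used, and so on in a loop. Note that InnoDB computes the ratio of the checkpoint age to the redo log size and performs different I/O operations depending on it, as shown in the following figure:

[Figure: logflush]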
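
The circular reuse of the log files can be sketched with a little arithmetic. This is a simplified model, not InnoDB's actual LSN-to-offset code: it assumes the two 50331648-byte files from the configuration above, and it assumes that the first 2048 bytes of each file are reserved for the file header. `locate` is a hypothetical helper name.

```python
LOG_FILE_SIZE = 50331648   # innodb_log_file_size from the configuration above
FILES_IN_GROUP = 2         # innodb_log_files_in_group
FILE_HEADER_SIZE = 2048    # assumption: header area reserved in every file


def locate(bytes_written: int) -> tuple:
    """Map the amount of redo written so far to (file index, offset in file)."""
    capacity_per_file = LOG_FILE_SIZE - FILE_HEADER_SIZE
    group_capacity = capacity_per_file * FILES_IN_GROUP
    position = bytes_written % group_capacity          # wrap around the group
    file_index, offset = divmod(position, capacity_per_file)
    return file_index, FILE_HEADER_SIZE + offset


if __name__ == "__main__":
    per_file = LOG_FILE_SIZE - FILE_HEADER_SIZE
    for written in (0, per_file - 1, per_file, 2 * per_file):
        print(written, locate(written))
```

Once the second file is full, the position wraps back to the start of the first file, which is why old redo must be checkpointed before it can be overwritten.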

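The redo log is made up of log blocks; a log block is normally 512 bytes (os_file_log_block_size). A redo log file has three main parts: the file header, the checkpoints, and the log blocks. Note that the file header here is a logical concept: it consists of 4 log blocks, 2048 bytes in total, and contains the two checkpoints. In 8.0 the concrete layout changed considerably; this format is called version 2, while the one in 5.7.9 is version 1. Here is the concrete layout:

log file header:

[Figure: redo5]

The checkpoint in the new version is laid out as follows:

[Figure: redo6]

It is considerably shorter than in the previous version, and the structure is simpler. Now look at an actual redo log file: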

0000000: 0000 0002 0000 0000 0000 0000 0000 2200  ..............".
0000010: 4d79 5351 4c20 382e 302e 3000 0000 0000  MySQL 8.0.0.....
0000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0000 0000 0000 0000 0000 0000 0000 0000  ................

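As the data above shows, it maps one-to-one onto the structure. Next, the layout of a redo log block:

[Figure: redo7]

The part above is the block header; the actual redo records are stored after it. That is the rough structure of the redo log. Going deeper would mean tying together log writing, database recovery, and redo parsing.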
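
To make the file header layout concrete, the first bytes of the hex dump shown earlier can be decoded by hand. The sketch below is my reading of that dump rather than an authoritative definition: it assumes a 4-byte format version at offset 0, an 8-byte start LSN at offset 8, and a 32-byte creator string at offset 16, all big-endian. The field names and `parse_log_file_header` are illustrative.

```python
import struct

# First 48 bytes of the dump shown earlier, retyped as hex.
HEADER_HEX = (
    "00000002" "00000000" "0000000000002200"
    "4d7953514c20382e302e30" + "00" * 21
)


def parse_log_file_header(buf: bytes) -> dict:
    """Decode the assumed header fields from the start of a redo log file."""
    # ">IIQ": big-endian 4-byte version, 4 bytes of padding, 8-byte start LSN.
    format_version, _padding, start_lsn = struct.unpack_from(">IIQ", buf, 0)
    # The creator string is NUL-padded to 32 bytes.
    creator = buf[16:48].split(b"\x00", 1)[0].decode("ascii")
    return {
        "format_version": format_version,
        "start_lsn": start_lsn,
        "creator": creator,
    }


if __name__ == "__main__":
    print(parse_log_file_header(bytes.fromhex(HEADER_HEX)))
```

With the sample bytes this yields format version 2, which matches the "version 2" format described above, and the creator string "MySQL 8.0.0".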

Sys schema

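Note: the test environment for this document is MySQL 8.0.

In earlier versions the sys schema was also known as ps_helper; its features have been steadily strengthened with each release: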

mysql> show tables;
+-----------------------------------------------+
| Tables_in_sys                                 |
+-----------------------------------------------+
| host_summary                                  |
| host_summary_by_file_io                       |
| host_summary_by_file_io_type                  |
  ............
  .................
| x$wait_classes_global_by_avg_latency          |
| x$wait_classes_global_by_latency              |
| x$waits_by_host_by_latency                    |
| x$waits_by_user_by_latency                    |
| x$waits_global_by_latency                     |
+-----------------------------------------------+
101 rows in set (0.01 sec)

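At this point, anyone who has used Oracle will surely perk up: x$ is a very familiar face, and it looks like official MySQL is picking up more and more Oracle DB heritage. But don't be fooled by appearances: among all of these objects there is only one table, sys_config; the rest are views, stored procedures, and functions, and their underlying data mostly comes from performance_schema and information_schema. If the sys schema is not installed, it can be installed as follows:

$ git clone https://github.com/MarkLeith/mysql-sys.git /tmp/sys
$ cd /tmp/sys
$ mysql -u user -p < sys_<version>.sql

Check whether sys has been loaded: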

mysql> select * from version;
+-------------+---------------+
| sys_version | mysql_version |
+-------------+---------------+
| 1.5.1       | 8.0.0-dmr     |
+-------------+---------------+
1 row in set (0.01 sec)
mysql> select * from schema_object_overview where db='sys';
+------+---------------+-------+
| db   | object_type   | count |
+------+---------------+-------+
| sys  | BASE TABLE    |     1 |
| sys  | FUNCTION      |    22 |
| sys  | INDEX (BTREE) |     1 |
| sys  | PROCEDURE     |    26 |
| sys  | TRIGGER       |     2 |
| sys  | VIEW          |   100 |
+------+---------------+-------+
6 rows in set (0.31 sec)

The main views fall into the following categories:

  • user/host summary views
mysql> show tables like '%summary%';
+-------------------------------------+
| Tables_in_sys (%summary%)           |
+-------------------------------------+
| host_summary                        |
| host_summary_by_file_io             |
| host_summary_by_file_io_type        |
| host_summary_by_stages              |
| host_summary_by_statement_latency   |
| host_summary_by_statement_type      |
| user_summary                        |
| user_summary_by_file_io             |
| user_summary_by_file_io_type        |
| user_summary_by_stages              |
| user_summary_by_statement_latency   |
| user_summary_by_statement_type      |
| x$host_summary                      |
| x$host_summary_by_file_io           |
| x$host_summary_by_file_io_type      |
| x$host_summary_by_stages            |
| x$host_summary_by_statement_latency |
| x$host_summary_by_statement_type    |
| x$user_summary                      |
| x$user_summary_by_file_io           |
| x$user_summary_by_file_io_type      |
| x$user_summary_by_stages            |
| x$user_summary_by_statement_latency |
| x$user_summary_by_statement_type    |
+-------------------------------------+
24 rows in set (0.00 sec)
  • IO summary views
mysql> show tables like 'io_%';
+------------------------------+
| Tables_in_sys (io_%)         |
+------------------------------+
| io_by_thread_by_latency      |
| io_global_by_file_by_bytes   |
| io_global_by_file_by_latency |
| io_global_by_wait_by_bytes   |
| io_global_by_wait_by_latency |
+------------------------------+
5 rows in set (0.00 sec)
  • Schema analysis views
mysql> show tables like 'schema_%';
+-------------------------------------+
| Tables_in_sys (schema_%)            |
+-------------------------------------+
| schema_auto_increment_columns       |
| schema_index_statistics             |
| schema_object_overview              |
| schema_redundant_indexes            |
| schema_table_lock_waits             |
| schema_table_statistics             |
| schema_table_statistics_with_buffer |
| schema_tables_with_full_table_scans |
| schema_unused_indexes               |
+-------------------------------------+
9 rows in set (0.00 sec)
  • Wait analysis views
mysql> show tables like 'wait_%';
+------------------------------------+
| Tables_in_sys (wait_%)             |
+------------------------------------+
| wait_classes_global_by_avg_latency |
| wait_classes_global_by_latency     |
| waits_by_host_by_latency           |
| waits_by_user_by_latency           |
| waits_global_by_latency            |
+------------------------------------+
5 rows in set (0.01 sec)
  • Statement Analysis Views
mysql> show tables like 'state%';
+---------------------------------------------+
| Tables_in_sys (state%)                      |
+---------------------------------------------+
| statement_analysis                          |
| statements_with_errors_or_warnings          |
| statements_with_full_table_scans            |
| statements_with_runtimes_in_95th_percentile |
| statements_with_sorting                     |
| statements_with_temp_tables                 |
+---------------------------------------------+
6 rows in set (0.01 sec)
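
The categories above follow a naming convention, which the rough sketch below turns into a lookup. It assumes that each x$ view is the raw counterpart of the view with the same name minus the prefix, and it only knows the prefixes visible in the listings above; `CATEGORIES` and `categorize` are hypothetical names.

```python
# Prefix -> category, taken from the listings above.
CATEGORIES = [
    ("host_summary", "user/host summary"),
    ("user_summary", "user/host summary"),
    ("io_", "IO summary"),
    ("schema_", "schema analysis"),
    ("wait", "wait analysis"),
    ("statement", "statement analysis"),
]


def categorize(view_name: str) -> str:
    """Return the category of a sys view based on its name prefix."""
    # Treat x$foo like foo (assumption: same view, unformatted values).
    name = view_name[2:] if view_name.startswith("x$") else view_name
    for prefix, category in CATEGORIES:
        if name.startswith(prefix):
            return category
    return "other"


if __name__ == "__main__":
    for view in ("x$host_summary_by_file_io", "waits_global_by_latency",
                 "statements_with_sorting", "sys_config"):
        print(view, "->", categorize(view))
```
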
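Among the stored procedures, the ps_setup_xxxxxxxx ones can be used to inspect the Performance Schema (ps) settings:

[Figure: ps1]

Besides these, two more stored procedures deserve attention: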

  • ps_trace_statement_digest

Its main purpose is:

ps_trace_statement_digest() analyses live traffic, looking for a certain statement digest for a period of time. It captures statistics on each matching statement it finds and returns a report of the captured stats: an overall summary, a breakdown for the longest running example, and an EXPLAIN (if the statement is not truncated).

Its parameters are:

       in_digest        The statement digest to analyse
       in_runtime       How long to run analysis for
       in_interval      How often to snapshot data
       in_start_fresh   Whether to truncate P_S tables first
       in_auto_enable   Whether to auto enable required config

call ps_trace_statement_digest('6134e9d6f25eb8e6cddf11f6938f202a', 60, 1, true, true);

Note the value "6134e9d6f25eb8e6cddf11f6938f202a" here: it comes from statement_analysis.

  • ps_trace_thread

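It monitors the activity of a given thread and can produce a log file in 'dot' format, from which a hierarchy graph can be generated with the dot language.

To learn more about the sys schema, or if some parameter is unclear, just read the source of the relevant views and stored procedures. That is the most accurate reference, although some of it takes a bit of work to digest.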

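MySQL Thread Pool

To save database resources, improve performance, and keep things stable when connecting to a database, a common approach is to use a connection pool in application code. The rough principle is shown in the figure below:

[Figure: tps3]

The connection pool keeps connections to the database open, and each connection corresponds to one thread inside MySQL. An incoming request is assigned to an idle connection; if no connection is available, an error is raised after a timeout. Recent MySQL versions introduced the thread pool feature, which is provided on the MySQL side and requires no application code. Its principle is shown in the figure below:

[Figure: tps4]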

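In this approach the database provides the thread pool and divides it into groups; each request is executed by a thread in one of the groups. In MySQL the actual work is done by threads: when a client connects to the server, a corresponding thread is created and carries out that client's operations. A thread generally runs on a single CPU, however, and in current versions multiple cores do not make a single thread any faster, whereas Oracle can split one operation across several processes and run them in parallel. The concurrency of MySQL threads is controlled by the innodb_thread_concurrency parameter; setting it appropriately helps avoid incidents caused by resource contention. Note that in MySQL the thread pool is implemented as a separate plugin, while in MariaDB and Percona it has to be compiled in to take effect; each approach has its advantages. Now let's look at how the thread pool works:

[Figure: thp1]

The thread pool is made up of thread groups. By default there are 16 thread groups, and each group can hold up to 4096 threads. Initially each group has just one thread, which acts as the listener. When new connection requests arrive, the connection manager distributes them across the thread groups in round-robin fashion. Once a request has been assigned to a thread group, one of the following happens:

[Figure: tps5]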

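1. When a new request arrives, if the listener thread is not busy, nothing else is executing, and the operation can finish quickly, no new listener thread is created and the listener handles the operation itself.

2. If the listener is busy, the request is first placed in a wait queue; the point is to reduce the number of threads created for short operations and so save resources.

3. If the operation exceeds the configured parameter value, a new thread is created to act as listener and keep serving requests.

The wait queues are not all the same. The low priority queue mainly holds: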

  • all statements for non-transactional storage engines
  • all statements if autocommit is enabled
  • the first statement in  an InnoDB transaction

while the high priority queue mainly holds:

  • any subsequent statements in InnoDB transactions, and
  • any statements kicked up from the low priority queue.

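Note that requests can move between the two queues: once a request has waited longer than the value of thread_pool_prio_kickup_timer, it is moved from the low priority queue to the high priority queue. If the high priority queue is empty, the listener executes user requests from the low priority queue; if the group's high priority queue is not empty, the listener executes the requests in the high priority queue first and then those in the low priority queue.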
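
The round-robin dispatch and the two queues described above can be put into a small model. This is a toy sketch, not the plugin's implementation: `ThreadPool`, `ThreadGroup`, and the `in_transaction` flag are hypothetical, time is passed in explicitly as a number, and the kick-up timer value of 1000 is an assumed setting.

```python
from collections import deque


class ThreadGroup:
    """One thread group with a low and a high priority queue."""

    def __init__(self, kickup_timer: int = 1000):
        self.kickup_timer = kickup_timer  # assumed thread_pool_prio_kickup_timer value
        self.high = deque()               # statements
        self.low = deque()                # (enqueue time, statement)

    def enqueue(self, statement: str, now: int, in_transaction: bool) -> None:
        # Simplification: later statements of a transaction go to the high queue,
        # everything else (autocommit, first statement) to the low queue.
        if in_transaction:
            self.high.append(statement)
        else:
            self.low.append((now, statement))

    def kick_up(self, now: int) -> None:
        """Move statements that waited longer than the timer to the high queue."""
        while self.low and now - self.low[0][0] > self.kickup_timer:
            self.high.append(self.low.popleft()[1])

    def next_statement(self, now: int):
        """High priority queue first, then low; None when both are empty."""
        self.kick_up(now)
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()[1]
        return None


class ThreadPool:
    """Assigns new connections to thread groups in round-robin order."""

    def __init__(self, group_count: int = 16):
        self.groups = [ThreadGroup() for _ in range(group_count)]
        self._next = 0

    def assign(self) -> int:
        index = self._next
        self._next = (self._next + 1) % len(self.groups)
        return index


if __name__ == "__main__":
    pool = ThreadPool(group_count=4)
    print([pool.assign() for _ in range(6)])
    group = pool.groups[0]
    group.enqueue("autocommit select", now=0, in_transaction=False)
    group.enqueue("second statement of a transaction", now=1, in_transaction=True)
    print(group.next_statement(now=2))
    print(group.next_statement(now=2))
```

In this model a statement in the low priority queue is never starved: it either runs when the high queue is empty or is promoted once it has waited past the timer.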

Percona's version improves the priority queues further; see:

https://www.percona.com/blog/2014/01/29/percona-server-thread-pool-improvements/

Other references: http://www.programering.com/a/MTO5YDMwATI.html

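Online Chat for Live Video Streaming in Practice

Live streaming is the hot trend right now. As an engineer, I care more about practical technology, such as the real-time chat found in every live-stream window. This feature is very useful: it matters a great deal for keeping the room lively and is also an important driver of revenue. Below I describe a technical practice I took part in. A small company has no fancy technology, so everything chosen here is commonplace, but it should be enough for an ordinary user base. For real-time communication in the web page we chose nodejs with socket.io, and that is where the problem starts:

[Figure: nodejs1]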

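nodejs has a single-process architecture. If clients connect directly to the socket.io server, a single nodejs process runs into trouble at around 6000 concurrent connections in my tests. So what to do? Start multiple node processes:

[Figure: nodejs2]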

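With multiple nodejs processes, each client can connect to only one of them. The problems then are that there is no high availability and that clients on different processes cannot talk to each other. What if each client connects to every server instead? That is not reasonable either: with 10,000 users and 10 servers, that means 100,000 connections, which clearly won't do. So consider a third approach:

[Figure: nodejs3]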

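Split the messaging backend into servers and nodes. Servers face the clients, and every node connects to all servers at once, so when any message reaches a node it can be sent to every server, and each server in turn passes it on. This gives client-to-client communication as well as high availability: any node or server can go down without affecting the system. With the configuration kept in parameter files, the system can be scaled horizontally. In real use, 100,000 concurrent connections caused no strain at all.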
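
The server/node split described above can be modelled in memory to show the message flow and the connection counts. The sketch is in Python purely for illustration (the real system is built on nodejs and socket.io); `Server`, `Node`, and `connection_count` are hypothetical names, and network connections are replaced by plain method calls.

```python
class Server:
    """Faces the clients and hands each message to any node that is alive."""

    def __init__(self, name: str):
        self.name = name
        self.alive = True
        self.inboxes = {}   # client id -> list of (sender, text)
        self.nodes = []     # every node registers itself here

    def connect(self, client_id: str) -> None:
        self.inboxes[client_id] = []

    def publish(self, sender: str, text: str) -> bool:
        for node in self.nodes:
            if node.alive:              # skip nodes that are down
                node.relay(sender, text)
                return True
        return False

    def deliver(self, sender: str, text: str) -> None:
        for client_id, inbox in self.inboxes.items():
            if client_id != sender:
                inbox.append((sender, text))


class Node:
    """Connected to every server; fans each message out to all of them."""

    def __init__(self, name: str, servers: list):
        self.name = name
        self.alive = True
        self.servers = servers
        for server in servers:
            server.nodes.append(self)

    def relay(self, sender: str, text: str) -> None:
        for server in self.servers:
            if server.alive:
                server.deliver(sender, text)


def connection_count(clients: int, servers: int, nodes: int) -> tuple:
    """(every client to every server, layered server/node design)."""
    return clients * servers, clients + servers * nodes


if __name__ == "__main__":
    servers = [Server("s1"), Server("s2")]
    nodes = [Node("n1", servers), Node("n2", servers)]
    servers[0].connect("alice")
    servers[1].connect("bob")
    servers[0].publish("alice", "hi")
    nodes[0].alive = False               # one node goes down
    servers[1].publish("bob", "hello")   # still delivered through n2
    print(servers[1].inboxes["bob"], servers[0].inboxes["alice"])
    print(connection_count(10000, 10, 2))
```

With 10,000 clients and 10 servers, connecting every client to every server costs 100,000 connections, while the layered design needs one connection per client plus a small server-to-node mesh.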