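As information technology advances rapidly, the hardware foundations described by the principles of computer organization are merging ever more closely with system architecture and software development practice, and together they are shaping the future of computing. Trends in computer systems do not evolve in isolation: they deeply influence the ideas, methods, and tools of software development and push the whole information industry forward.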
I. Core Trends in Computer System Development
Computer systems today show several prominent and interrelated trends:
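- The rise of heterogeneous computing and specialized architectures: The traditional CPU-centric, general-purpose computing model is gradually giving way to heterogeneous systems built from CPUs, GPUs, FPGAs, and AI accelerators (such as NPUs and TPUs). The trend is driven by the pursuit of better energy efficiency and of maximum performance on specific workloads such as graphics rendering, deep learning, and scientific computing. For example, Apple's M-series chips and NVIDIA's Grace Hopper superchip both integrate several kinds of compute units into a single system-on-chip (SoC) or tightly coupled package. This challenges software development: developers need to understand the characteristics of each compute unit and use heterogeneous programming frameworks such as CUDA, OpenCL, and SYCL to exploit the hardware fully.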
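- Deep integration of storage and compute: The "memory wall" of the von Neumann architecture, where memory bandwidth and latency become the performance bottleneck, is increasingly prominent. To break through this limit, new architectures such as near-memory computing (Processing-in-Memory, PIM) and compute-in-memory (Computing-in-Memory) are moving from research into practical use. These architectures embed part of the computation inside the memory units, which reduces data movement and substantially improves energy efficiency. Software development has to adapt: algorithms and data structures may need to be redesigned to take advantage of new memory access patterns and compute primitives.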
- 系統層ç´çè»ç¡¬ä»¶ååè¨è¨ï¼çºäºæå°ç¹å®é åï¼å¦èªåé§é§ãç©è¯ç¶²ãäºè¨ç®ï¼çèå»éæ±ï¼è»ç¡¬ä»¶ååè¨è¨è®å¾è³ééè¦ã徿令鿶æ§ï¼å¦RISC-Vçéæ¾èæ¨¡å¡åï¼ãç¡¬ä»¶å¾®æ¶æ§ï¼å°æä½ç³»çµ±ãç·¨è¯å¨ãéè¡æåº«ï¼é½å¨é²è¡ä¸é«ååªåãä¾å¦ï¼è°·æçºå ¶TPUå®å¶äºTensorFlowæ¡æ¶åç·¨è¯å¨æ£§ï¼æ°èçé åå°ç¨æ¶æ§ï¼DSAsï¼åé åå°ç¨èªè¨ï¼DSLsï¼æ£æ¯éä¸è¶¨å¢çé«ç¾ï¼æ¨å¨è®è»ä»¶æ´é«æå°âé§é¦âå°ç¨ç¡¬ä»¶ã
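- Ubiquitous computing across cloud, edge, and device: Computing resources are no longer confined to data centers or personal devices. They are distributed across the cloud, edge nodes, and end devices, forming a cooperating continuum. This requires systems that can schedule tasks and migrate data dynamically while guaranteeing security, low latency, and privacy. Software development accordingly needs to adopt microservices, serverless computing, and edge computing frameworks, and to handle distributed consistency and state management properly.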
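- Security and trust as foundational properties: From hardware-level trusted execution environments (TEEs, such as Intel SGX and ARM TrustZone) and memory-safety hardware extensions to system-level confidential computing, security has been raised to the level of low-level architectural design. Software development must build security considerations in from the start and use the security features that hardware provides to construct more trustworthy applications.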
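The memory wall mentioned in the list above can be made concrete with a small calculation. The sketch below uses the roofline model, which the text itself does not name and which is brought in here only as an illustration: attainable throughput is bounded by the smaller of peak compute and memory bandwidth multiplied by arithmetic intensity (FLOPs per byte moved). The `Device` class, the function names, and all numbers are hypothetical and chosen for readability, not taken from any real chip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical device description; the figures are illustrative only."""
    name: str
    peak_gflops: float    # peak compute throughput in GFLOP/s
    bandwidth_gbs: float  # sustained memory bandwidth in GB/s


def attainable_gflops(device: Device, flops_per_byte: float) -> float:
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity)."""
    return min(device.peak_gflops, device.bandwidth_gbs * flops_per_byte)


def is_memory_bound(device: Device, flops_per_byte: float) -> bool:
    """True when memory bandwidth, not compute, limits the kernel."""
    return device.bandwidth_gbs * flops_per_byte < device.peak_gflops


if __name__ == "__main__":
    # Assumed example device: 1000 GFLOP/s peak, 100 GB/s bandwidth.
    device = Device("example-accelerator", peak_gflops=1000.0, bandwidth_gbs=100.0)

    # A float32 vector add performs 1 FLOP per 12 bytes moved
    # (two 4-byte loads and one 4-byte store).
    kernels = {"vector add": 1 / 12, "dense matmul (assumed)": 50.0}

    for kernel, intensity in kernels.items():
        bound = attainable_gflops(device, intensity)
        limit = "memory" if is_memory_bound(device, intensity) else "compute"
        print(f"{kernel}: at most {bound:.1f} GFLOP/s ({limit}-bound)")
```

Under these assumed numbers, a kernel with low arithmetic intensity can use only a small fraction of the peak compute rate, which is the situation that near-memory and compute-in-memory designs try to improve by reducing data movement.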
II. Profound Changes in the Software Development Paradigm
The hardware and system trends above directly drive innovation in software development on several levels:
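- Higher and more diverse programming abstractions: To reduce the complexity of heterogeneous programming, higher-level abstractions keep emerging. For example, tensor-based programming models (PyTorch/TensorFlow) hide the details of the underlying GPU or TPU, and domain-specific languages (such as Halide for image processing) let developers concentrate on algorithmic logic rather than hardware scheduling. Systems programming languages such as Rust and Zig, with their emphasis on memory safety and concurrency safety, are being used to build safer and more efficient low-level system software that matches the capabilities of modern hardware.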
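- Smarter compilation and optimization: The role of the compiler has evolved from traditional code translation into that of a key performance optimization engine. Modern compiler infrastructures such as LLVM and MLIR use multiple levels of intermediate representation, which lets them optimize deeply for different backend hardware (CPUs, GPUs, AI chips) and even perform auto-tuning. Machine learning is also used to guide compiler optimization decisions and produce better code.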
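- DevOps and infrastructure as code: The spread of cloud-native development and continuous delivery has tied the building, deployment, and monitoring of software closely to the management of the underlying compute infrastructure. Containers (Docker), orchestration (Kubernetes), and infrastructure-as-code (IaC) tools let software run elastically on heterogeneous, dynamic pools of hardware resources. Developers need some degree of a systems operations perspective.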
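- Fine-grained modeling of performance, energy efficiency, and cost: With pay-as-you-go cloud computing and battery-limited mobile devices, software development no longer pursues peak performance alone and must also account for energy efficiency and cost. Developers need performance analysis tools (profilers) to find hot spots, to understand the full path from application code down to hardware instructions, and to optimize where it matters.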
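- Shifting security left: As hardware security features grow richer, the software development process needs to integrate security practices earlier. Examples include using TEE-enabled SDKs to develop confidential computing applications, and considering hardware-level threat models such as side-channel attacks during code audits and testing.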
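The profiler-driven workflow mentioned in the list above can be sketched with Python's standard `cProfile` and `pstats` modules. This is a minimal illustration rather than a recommended toolchain: the workload and the function names `slow_sum_of_squares`, `fast_sum_of_squares`, `workload`, and `profile_report` are invented for the example, and a Python-level profiler shows only function-level hot spots, not hardware instructions.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n: int) -> int:
    """Deliberately naive hot spot: an interpreted loop over n values."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum_of_squares(n: int) -> int:
    """Closed form for the sum of i*i over 0 <= i < n."""
    return (n - 1) * n * (2 * n - 1) // 6


def workload() -> int:
    """Hypothetical workload mixing a slow and a fast code path."""
    return slow_sum_of_squares(200_000) + fast_sum_of_squares(200_000)


def profile_report(func, top: int = 5) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    # The report lists slow_sum_of_squares near the top, which marks it
    # as the candidate for targeted optimization.
    print(profile_report(workload))
```

Once the report identifies the loop as the hot spot, replacing it with the closed form gives the same result at far lower cost, which is the kind of targeted optimization the text describes.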
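The principles of computer organization are the foundation for understanding these trends. From the transistor to very-large-scale integrated circuits, and from the von Neumann architecture to compute-in-memory, every step in hardware evolution creates new possibilities for software and also sets new constraints. Future software engineers will need not only mastery of algorithms and data structures but also a deep understanding of low-level hardware architecture, system-level optimization, and cross-layer co-design. Only then can they handle increasingly complex computing systems, build the next generation of high-performance, energy-efficient, secure, and reliable software, and truly unlock the potential of exponentially growing hardware compute. The future of computing systems will be one of deep hardware/software integration and collaborative innovation.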