[Feature][Master] Add task caching mechanism to improve the running speed of repetitive tasks #13194
Merged
Codecov Report
@@             Coverage Diff              @@
##                dev   #13194      +/-   ##
============================================
+ Coverage     39.37%   39.46%   +0.09%
- Complexity     4278     4294      +16
============================================
  Files          1066     1069       +3
  Lines         40479    40641     +162
  Branches       4657     4673      +16
============================================
+ Hits          15937    16040     +103
- Misses        22755    22812      +57
- Partials       1787     1789       +2
caishunfeng
reviewed
Dec 15, 2022
Comment on lines +349 to +350
while (true) {
    TaskInstance cacheTaskInstance = taskInstanceDao.findTaskInstanceByCacheKey(cacheKey);
Contributor
Maybe it's not a loop action.
Member
Author
One cacheKey may match multiple rows. For example, two cached tasks with the same cache key may run almost simultaneously.
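The reasoning for the loop can be sketched as follows. This is a minimal illustration, not the code from the diff: it assumes the loop's job is to clear every row that shares a cache key, and that the lookup returns only one row per call. `CacheRowStore` and all of its methods are hypothetical stand-ins for the real DAO.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory stand-in for the task instance table; not the real DAO.
final class CacheRowStore {
    // Each entry is the cache key of one stored row; duplicates are allowed.
    private final List<String> cacheKeys = new ArrayList<>();

    void save(String cacheKey) {
        cacheKeys.add(cacheKey);
    }

    // Returns at most one matching row per call, like a single-row lookup.
    Optional<String> findOneByCacheKey(String cacheKey) {
        return cacheKeys.stream().filter(cacheKey::equals).findFirst();
    }

    // Removes exactly one matching row.
    void removeOne(String cacheKey) {
        cacheKeys.remove(cacheKey);
    }

    // Keeps looking up and clearing until no row with this key is left.
    int clearAll(String cacheKey) {
        int cleared = 0;
        while (findOneByCacheKey(cacheKey).isPresent()) {
            removeOne(cacheKey);
            cleared++;
        }
        return cleared;
    }

    int size() {
        return cacheKeys.size();
    }
}
```

With a single lookup instead of the loop, a second row written by a near-simultaneous run would survive the clear.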
if (tagCacheKey.contains(MERGE_TAG)) {
    String[] split = tagCacheKey.split(MERGE_TAG);
    if (split.length == 2) {
        taskIdAndCacheKey = Pair.of(Integer.parseInt(split[0]), split[1]);
Check notice
Code scanning / CodeQL
Missing catch of NumberFormatException
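One way to address the CodeQL notice is to catch the `NumberFormatException` and treat a malformed key as "no task id". The sketch below is an assumption about how that could look, not the fix applied in this PR: the class name `TagCacheKeyParser`, the `"-"` value for `MERGE_TAG`, and the use of `Optional` with `Map.Entry` in place of `Pair` are all chosen for illustration.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

final class TagCacheKeyParser {
    // Assumed separator for illustration; the real MERGE_TAG value is not shown in the diff.
    static final String MERGE_TAG = "-";

    // Splits "<taskId><MERGE_TAG><cacheKey>" into its two parts.
    // Returns empty when the tag is missing, the shape is wrong, or the id is not an integer.
    static Optional<Map.Entry<Integer, String>> parse(String tagCacheKey) {
        if (tagCacheKey == null || !tagCacheKey.contains(MERGE_TAG)) {
            return Optional.empty();
        }
        // Quote the tag so it is matched literally rather than as a regex.
        String[] split = tagCacheKey.split(Pattern.quote(MERGE_TAG));
        if (split.length != 2) {
            return Optional.empty();
        }
        try {
            int taskId = Integer.parseInt(split[0]);
            return Optional.of(new SimpleImmutableEntry<>(taskId, split[1]));
        } catch (NumberFormatException e) {
            // A non-numeric task id means the key was not produced by the merge step.
            return Optional.empty();
        }
    }
}
```

Returning an empty result keeps the caller's control flow the same as for a key without the tag, which may or may not be the behavior the reviewers would prefer over logging or rethrowing.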
Radeity
reviewed
Dec 15, 2022
a0646a4 to b7819e5
Member
Author
@Amy0104 @songjianet please help review the front-end code
zhongjiajie
previously approved these changes
Dec 16, 2022
zhongjiajie
reviewed
Dec 16, 2022
…ysql.sql Co-authored-by: Jay Chung <zhongjiajie955@gmail.com>
…ostgresql.sql Co-authored-by: Jay Chung <zhongjiajie955@gmail.com>
01e1163 to e5f80d2
caishunfeng
approved these changes
Dec 16, 2022
SonarCloud Quality Gate failed.
Member
Author
@caishunfeng @ruanwenjun please PTAL, thanks
Purpose of the pull request
close #13133
Similar to Flyte caching.
In machine learning workflows, caching the results of some tasks lets the workflows run faster.
How do we determine whether a cached task has already run, that is, whether a task can reuse the result of another task?
For a task marked as `Cache Execution`, a cache key is generated when the task starts. The key is a hash of several fields, including the parameters the task references with `${}` and the environment configuration defined under security - environment management.
When a task with the cache flag runs, it checks whether the database holds data with the same cache key.
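The key generation and lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's `TaskCacheUtils`: the four hashed fields (task code, version, parameters, environment configuration), the SHA-256 choice, the in-memory map standing in for the database, and every name in the class are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

final class TaskCacheSketch {
    // Hypothetical store: cache key -> result of the run that produced it.
    private final Map<String, String> resultsByCacheKey = new HashMap<>();

    // Hashes the inputs that identify a run. The chosen fields are assumptions for illustration.
    static String cacheKey(long taskCode, int version, Map<String, String> params, String environmentConfig) {
        StringBuilder raw = new StringBuilder();
        raw.append(taskCode).append('\n').append(version).append('\n');
        // A TreeMap gives a stable parameter order, so equal inputs hash equally.
        new TreeMap<>(params).forEach((k, v) -> raw.append(k).append('=').append(v).append('\n'));
        raw.append(environmentConfig);
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(raw.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // On a hit, returns the stored result without running; on a miss, runs the task and stores its result.
    String runWithCache(String cacheKey, Supplier<String> task) {
        String cached = resultsByCacheKey.get(cacheKey);
        if (cached != null) {
            return cached;
        }
        String result = task.get();
        resultsByCacheKey.put(cacheKey, result);
        return result;
    }
}
```

Because the version and the parameters both feed the hash, changing either one produces a different key, so a stale result is not reused for new inputs.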
If you do not want a cached result, right-click the node in the workflow instance and run `Clear cache`. This clears the cached data for the current input parameters under this version.
Brief change log
Front end
- Add `useCache` flag to all task plugins except the logical components
- Add `handleRemoveTaskInstanceCache` method to clear the task instance cache
Backend
- Add `removeTaskInstanceCache` API to clear the task instance cache
- Add `TaskCacheUtils` to manage the cache key
- Add `checkIsCacheExecution` before `dispatcher.dispatch` sends the task to a worker
- Add `TaskCacheEventHandler` to handle the cached task instead of dispatching it
- Add `saveCacheTaskInstance` after the cached task runs successfully
Database
- Add `is_cache` to tables `t_ds_task_definition`, `t_ds_task_definition_log`, `t_ds_task_instance`
- Add `cache_key` to `t_ds_task_instance`
Doc
- Add `Cache Execution` parameter introduction
Verify this pull request
This pull request is code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(or)
If your pull request contains an incompatible change, you should also add it to docs/docs/en/guide/upgrade/incompatible.md