diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml new file mode 100644 index 0000000..b0c2395 --- /dev/null +++ b/.github/workflows/ci.yml @@ -0,0 +1,28 @@ +name: ci +on: + push: + branches: + - master +permissions: + contents: write +jobs: + deploy: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - name: Configure Git Credentials + run: | + git config user.name github-actions[bot] + git config user.email 41898282+github-actions[bot]@users.noreply.github.com + - uses: actions/setup-python@v5 + with: + python-version: 3.x + - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV + - uses: actions/cache@v4 + with: + key: mkdocs-material-${{ env.cache_id }} + path: .cache + restore-keys: | + mkdocs-material- + - run: pip install mkdocs-material + - run: mkdocs gh-deploy --force diff --git a/.gitignore b/.gitignore index 844f4be..3b0dc41 100644 --- a/.gitignore +++ b/.gitignore @@ -4,4 +4,7 @@ *.log *.synctex.gz *.toc -_minted-* \ No newline at end of file +_minted-* + +# MkDocs' folder +.cache diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..45570d2 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,24 @@ +# Contributing to inzva Algorithm Program +Are you an algorithm enthusiast? Or did you just find a bug? Either way, inzva is community driven, and we'd love to have contributions from the community. + +## We Develop with GitHub +We use GitHub to host code, to track issues and feature requests, and to accept pull requests. + +## All Code Changes Through Pull Requests +Our documents and code come from many different contributors, and pull requests are the best way to discuss changes to the repository. We actively welcome your pull requests: + +1. Fork the repo and create your branch from `master`. The branch name convention is [username]/[kebab-case-short-branch-name]. +2. If you've added/changed TeX files, make sure they conform to our format. +3. 
If you've added/changed code, make sure it follows our code style. +4. Open your PR and don't forget to fill in the blanks in the PR opening screen. +5. We'll assign a reviewer for the PR and you can discuss the changes afterwards. +6. Congratulations! Your PR has been merged. + +## Report bugs/errors using GitHub's [issues](https://github.com/inzva/Algorithm-Program/issues) +We use GitHub issues to track bugs/errors. Report a bug by opening a new issue. Please describe the issue well; don't just open an issue with only a title. And don't hesitate to propose a fix. Again, we'd love to have your contribution. + +## License +By contributing, you agree that your contributions will be licensed under the project's MIT License. + +## References +This document was adapted from [here](https://gist.github.com/briandk/3d2e8b3ec8daf5a27a62#file-contributing-md). diff --git a/README.md b/README.md index 686e756..a152dff 100644 --- a/README.md +++ b/README.md @@ -1,86 +1,67 @@ -# Algorithm-Competition-Program +# Algorithm Program -The **30-week** Algorithm Competition Programme 2018-2019, divided into Fall and Spring semesters, will include lectures, contests, problem-solvings and a variety of practises every saturday. +inzva Algorithm Program includes lectures, contests, problem-solving sessions and a variety of practices every Saturday, aimed at teaching advanced knowledge of algorithms to university students, spreading algorithmic thinking and providing training which will help them in international contests as well as in their professional lives. + +We prepared this full-fledged program to last for weeks in order to grow the algorithm community in its technical capacity and ready the students for international contests. -After our National Fall, Winter and Summer Camps, we prepared this full-fledged programme to last all year in order to grow the algorithm community in it’s technical capacity and ready the students for international contests. 
+The participants are expected to not only have the skills, but also the enthusiasm and motivation for this unique program, which will be completely free of charge. The program will involve experienced editors to lecture the attendees, problem setters to prepare problems every week and reviewers to check their technical accuracy. A minimum attendance of 60%, evaluated by considering your presence in lectures and your regular participation in the weekly contests, is required for a participant to receive a certificate of graduation. -The participants are expected to not only have the skills, but also the enthusiasm and motivation for this unique event, which will be completely free of charge.The programme will involve experienced editors to lecture the attendees, problem setters to prepare problems every week and reviewers to check their technical accuracy. **There will be a minimum requirement of %80 for attendance.** +Aside from meeting online every Saturday, we will keep in touch via the Discord channel of the community. -Following the successful competitive programming communities in Romania, Bulgaria, Russia, Philippines and more, we strive to be a community that is eager to learn; where every member helps the other and the learners can also teach with the experienced. - -Aside from meeting at inzva every saturday, we will keep in touch via the discord channel of the community. **DATE & LOCATION** -Every Saturday, total of 30 days spread over 30 weeks. +Regular meetings occur on Saturdays. Every batch has different dates; we will publish the exact dates at [inzva.com](https://inzva.com) before every program. -Week 1: September 29 Saturday -/ Week 30: May 25 Saturday/ inzva - Beykoz Kundura **MOTIVATION** -We believe that the main benefit comes from the opportunity to practice with challenging problems. 
Here are some other benefits we think the participants will acquire from the camp: +We believe that the main benefit comes from the opportunity to practice with challenging problems and to take a new step into the world of algorithms. Here are some other benefits we think the participants will acquire from the program: +- Receiving knowledge and personal experience from successful students in the community and getting one-on-one mentorship - Motivating yourself to improve your knowledge on a subject - Assessing yourself - Coding more efficiently - Advanced knowledge of data structures and algorithms - Learning teamwork and critical thinking - Getting to know ICPC World better -- Technical adequacy and preparation for interviews -- The chance to be broadcasted as an Editor/Problem Setter/Reviewer or Attendee on inzva’s page. - -Those with %80+ attendance will have a certificate on Linkedin, GitHub and inzva.com as a graduate of this program, and those with proved success will be able to join our upcoming international summer camp directly. +- Technical adequacy and preparation for job interviews **TECHNICAL PROFICIENCY** All participants are expected to know a programming language well. Attendees must prepare their own programming environment (computer, IDE, compiler etc.). The whole practice process will run on [HackerRank](https://www.hackerrank.com) -You can find the curriculum [here](https://docs.google.com/spreadsheets/d/1f5r41dZ5-khcHL9ba_b2TNCLMg0QCqG-cEMdqlkwmGM/edit#gid=521339157) -The top three students will get prizes on the final contest day at the end of the year-long program. We will also have various surprises for those who make it to the top of the leaderboards with weekly contests. Provided, it’s about learning, teaching and sharing; not winning. +All participants who comply with the 60% attendance requirement and the contest rules will get a certificate and various surprises during the program. That said, it’s about learning, teaching and sharing; not winning. 
**FREQUENTLY ASKED QUESTIONS** -See the FAQ [here.](https://inzva.com/faq-algorithm-competition-programme) - -**HOW TO BE AN EDITOR/PROBLEM SETTER/REVIEWER** - -If you want to support the community as an **Editor, Problem Setter or Reviewer**, and get scholarship for your work by [BEV Foundation](https://bev.foundation/), you can find more information and application form via the links below. - -**EDITOR** - -Prepares the content for the week and lecture at inzva physically. [Read more](https://inzva.com/faq-algorithm-competition-programme) - -**PROBLEM SETTER** - -Prepares two contests consisting of 5 and 10 questions according to the lectures prepared by the editor. [Read more](https://inzva.com/faq-algorithm-competition-programme) - -**REVIEWER** +Every batch has different rules, selection criteria and application requirements; we will publish the FAQ at [inzva.com](https://inzva.com) before every program. -Reviews the content prepared by the editor and the problem setter, making sure it follows the guideline and curriculum. [Read -more](https://inzva.com/faq-algorithm-competition-programme) +**HOW TO BE AN EDITOR/PROBLEM SETTER** -We are proud to be founding a community together that will last years to come. With your support and contribution, we hope to be a sharing computer programming community for your generation and after. +If you want to support the community as an Editor or Problem Setter and get a scholarship from BEV Foundation for your effort, please contact us by sending an email to [algorithm@inzva.com](mailto:algorithm@inzva.com) with the subject “Being an Editor or Problem Setter for Algorithm Program”. 
**BUNDLES** | Name | Topics | |------|-------| | [01-Intro](https://github.com/inzva/Algorithm-Program/tree/master/bundles/01-intro) | Big O Notation, Recursion, Builtin Data Structures| -| [02-Algorithms-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/02-algorithms-1) | Sorting Algorithms, Search Algorithms| -| [03-Math-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/03-math-1) | Number Theory, Factorization, Combinatorics, Exponentiation| -| [04-Graph-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/04-graph-1) | Representing Graphs, Tree Traversals, Binary Search Tree, DFS, BFS, Union Find, Heap| -| [05-DP-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/05-dp-1) | Greedy Algorithms, Memoization, Common DP Problems| -| [06-Data-Structures-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/06-data-structures-1) | Stack, Queue, Deque, Linked List, Prefix Sum, Sparse Table, BIT, SQRT Decomposition, Segment Tree| -| [07-Graph-2](https://github.com/inzva/Algorithm-Program/tree/master/bundles/07-graph-2) | Bipartate Checking, Topoligical Sort, Shortest Path, Minimum Spanning Tree| -| [08-Data-Structures-2](https://github.com/inzva/Algorithm-Program/tree/master/bundles/08-data-structures-2) |Self Balancing Binary Trees, Lowest Common Ancestor in a Tree| -| [09-Data-Structures-3](https://github.com/inzva/Algorithm-Program/tree/master/bundles/09-data-structures-3) |Segment Tree with Lazy Propogation, Binary Search on Segment Tree, Mo's Algorithm, Trie| -| [10-DP-2/](https://github.com/inzva/Algorithm-Program/tree/master/bundles/10-dp-2) |Bitmask DP, DP on Rooted Trees, DP on DAGs, Digit DP, Tree Child-Sibling Notation| -| [11-Graph-3](https://github.com/inzva/Algorithm-Program/tree/master/bundles/11-graph-3) |Bridges and Articulation Points, SCC, BCC, Max Flow| -| [12-Math-3](https://github.com/inzva/Algorithm-Program/tree/master/bundles/12-math-3) |Vector Calculus, Area Calculation, 
Lines and Planes, Intersection, Convex Hull Problem, Rotating Calipers, Closest Pair Problem| -| [13-graph-5](https://github.com/inzva/Algorithm-Program/tree/master/bundles/13-graph-5) |Segment Tree on a Tree, Heavy-Light Decomposition, Centroid Decomposition of a Tree, Subtrees' Set-Swap Technique| -| [14-Algorithms-5](https://github.com/inzva/Algorithm-Program/tree/master/bundles/14-algorithms-5) |KMP, Robin-Karp Algorithm, Suffix Array, Longest Common Prefix Array| +| [02-Algorithms-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/02-algorithms-1) | Binary Search, Ternary Search, Sorting Algorithms, Quickselect, Divide and Conquer| +| [03-Math-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/03-math-1) | Number Theory, Sieve of Eratosthenes, Modular Inverse, GCD, LCM, Factorization, Combinatorics, Exponentiation, Meet in the Middle| +| [04-Graph-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/04-graph-1) | Representing Graphs, Tree Traversals (Preorder, Inorder, Postorder), Binary Search Tree, DFS, BFS, Union Find (DSU), Heap| +| [05-DP-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/05-dp-1) | Greedy Algorithms, Dynamic Programming, Memoization, Knapsack, Coin Problem, LCS, LIS| +| [06-Data-Structures-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/06-data-structures-1) | Stack, Queue, Deque, Linked List, Prefix Sum, Sparse Table, Binary Indexed Tree, SQRT Decomposition, Segment Tree| +| [07-Graph-2](https://github.com/inzva/Algorithm-Program/tree/master/bundles/07-graph-2) | Bipartite Checking, Topological Sort, Shortest Path (Dijkstra, Floyd-Warshall, Bellman Ford), Minimum Spanning Tree (Prim's, Kruskal's)| +| [08-Data-Structures-2](https://github.com/inzva/Algorithm-Program/tree/master/bundles/08-data-structures-2) | Self Balancing Binary Trees, Treap, AVL Tree, Red Black Tree, Lowest Common Ancestor| +| 
[09-Data-Structures-3](https://github.com/inzva/Algorithm-Program/tree/master/bundles/09-data-structures-3) | Segment Tree with Lazy Propagation, Binary Search on Segment Tree, Mo's Algorithm, Trie| +| [10-DP-2](https://github.com/inzva/Algorithm-Program/tree/master/bundles/10-dp-2) | Bitmask DP, DP on Rooted Trees, DP on DAGs, Digit DP, Tree Child-Sibling Notation| +| [11-Graph-3](https://github.com/inzva/Algorithm-Program/tree/master/bundles/11-graph-3) | Bridges and Articulation Points, Strongly Connected Components (SCC), BCC, Cycle Finding, Max Flow| +| [12-Math-3](https://github.com/inzva/Algorithm-Program/tree/master/bundles/12-math-3) | Vector Calculus, Area Calculation, Lines and Planes, Intersection, Convex Hull Problem, Rotating Calipers, Closest Pair Problem| +| [13-graph-5](https://github.com/inzva/Algorithm-Program/tree/master/bundles/13-graph-5) | Segment Tree on a Tree, Heavy-Light Decomposition, Centroid Decomposition of a Tree, Subtrees' Set-Swap Technique| +| [14-Algorithms-5](https://github.com/inzva/Algorithm-Program/tree/master/bundles/14-Algorithms-5) |String Matching Algorithms: KMP, Rabin-Karp Algorithm, Suffix Array, Longest Common Prefix Array| + + @@ -165,11 +146,15 @@ We are proud to be founding a community together that will last years to come. 
W | [inzva Algorithm Program 2018-2019 Graph-1 Onsite](https://www.hackerrank.com/inzva-04-graph-1-onsite-2018) | No Specific Topic| | [inzva Algorithm Program 2018-2019 DP-1 Online](https://www.hackerrank.com/inzva-05-dp-1-online-2018) | No Specific Topic| | [inzva Algorithm Program 2018-2019 DP-1 Onsite](https://www.hackerrank.com/inzva-05-dp-1-onsite-2018) | No Specific Topic| +| [inzva Fall Term Contest 2018](https://www.hackerrank.com/inzva-first-term-2018) | No Specific Topic| | [inzva Algorithm Program 2018-2019 Graph-2 Online](https://www.hackerrank.com/inzva-07-graph-2-online-2019) | No Specific Topic| | [inzva Algorithm Program 2018-2019 Graph-2 Onsite](https://www.hackerrank.com/inzva-07-graph-2-onsite-2019) | No Specific Topic| | [inzva Algorithm Program 2018-2019 Data Structures-2 Online](https://www.hackerrank.com/inzva-08-data-structures-2-online-2019) | No Specific Topic| -| [inzva Algorithm Program 2018-2019 DP-2 Online](https://www.hackerrank.com/inzva-10-dp-2-online-2019) | No Specific Topic| | [inzva Algorithm Program 2018-2019 Data Structures-2 Onsite](https://www.hackerrank.com/inzva-08-data-structures-2-onsite-2019) | No Specific Topic| +| [inzva Algorithm Program 2018-2019 Data Structures-3 Online](https://www.hackerrank.com/inzva-09-data-structures-3-online-2019) | No Specific Topic| +| [inzva Algorithm Program 2018-2019 Data Structures-3 Onsite](https://www.hackerrank.com/inzva-09-data-structures-3-onsite-2019) | No Specific Topic| +| [inzva Algorithm Program 2018-2019 DP-2 Online](https://www.hackerrank.com/inzva-10-dp-2-online-2019) | No Specific Topic| +| [inzva Algorithm Program 2018-2019 DP-2 Onsite](https://www.hackerrank.com/inzva-10-dp-2-onsite-2019) | No Specific Topic| | [inzva Algorithm Program 2018-2019 Graph-3 Online](https://www.hackerrank.com/inzva-11-graph-3-online-2019) | No Specific Topic| | [inzva Algorithm Program 2018-2019 Graph-3 Onsite](https://www.hackerrank.com/inzva-11-graph-3-onsite-2019) | No Specific Topic| | 
[inzva Algorithm Program 2018-2019 Math-3 Online](https://www.hackerrank.com/inzva-12-math-3-online-2019) | No Specific Topic| @@ -206,6 +191,7 @@ We are proud to be founding a community together that will last years to come. W | Name | Topic | |------|-------| +| [inzva Algorithm Competition Summer Camp 2019 Qualification](https://www.hackerrank.com/inzva-algorithm-competition-summer-camp-2019-qualification) | No Specific Topic| | [inzva ACSC 2019 Advanced #1](https://www.hackerrank.com/inzva-acsc-19-advanced-1) | No Specific Topic| | [inzva ACSC 2019 Advanced #2](https://www.hackerrank.com/inzva-acsc-19-advanced-2) | No Specific Topic| | [inzva ACSC 2019 Foundation Final](https://www.hackerrank.com/inzva-acsc-19-foundation-final) | No Specific Topic| @@ -298,4 +284,4 @@ We are proud to be founding a community together that will last years to come. W | Name | Topic | |------|-------| | [inzva Intermediate Training Set](https://www.hackerrank.com/inzva-intermediate-training-set) | No Specific Topic| - \ No newline at end of file + diff --git a/bundles/03-math-1/03_math1.pdf b/bundles/03-math-1/03_math1.pdf index b0f2b3f..9705939 100644 Binary files a/bundles/03-math-1/03_math1.pdf and b/bundles/03-math-1/03_math1.pdf differ diff --git a/bundles/03-math-1/README.md b/bundles/03-math-1/README.md index 7172a70..df553b7 100644 --- a/bundles/03-math-1/README.md +++ b/bundles/03-math-1/README.md @@ -1,5 +1,5 @@ -Onsite Contest --------------- +Onsite Contest +-------------- * https://www.hackerrank.com/inzva-03-math-1-onsite-2018 Online Contest diff --git a/bundles/03-math-1/latex/03_math1.tex b/bundles/03-math-1/latex/03_math1.tex index 9a5c90d..683f94a 100644 --- a/bundles/03-math-1/latex/03_math1.tex +++ b/bundles/03-math-1/latex/03_math1.tex @@ -482,7 +482,7 @@ \subsubsection{Properties of Modular Arithmetic} \end{itemize} \paragraph{\textbf{Multiplication}} \begin{itemize} - \item if $a = c$, then $a \mod{n} \cdot b \mod{n} \equiv c \mod{n}$. 
+ \item if $a \cdot b = c$, then $(a \mod{n}) \cdot (b \mod{n}) \equiv c \mod{n}$. \item if $a \equiv b \mod{n}$, then $a \cdot k \equiv b \cdot k \mod{n}$ for any integer $k$. \item if $a \equiv b \mod{n}$ and $c \equiv d \mod{n}$, then $(a \cdot c) \equiv (b \cdot d) \mod{n}$ \end{itemize} @@ -1348,16 +1348,24 @@ \subsection{Fast Exponentiation Approach } #define mod 1000000007 long long fastExp(long long n, long long k){ - if(k == 0) return 1; - if(k == 1) return n; + if (k == 0) + return 1; - long long temp = fastExp(n, k>>1); + n %= mod; + long long temp = fastExp(n, k >> 1); // If k is odd return n * temp * temp // If k is even return temp * temp - // Take mod, since we can have a large number that overflows from long long - if((k&1) == 1) return (n * temp * temp) % mod - return (temp * temp) % mod; + + // The product of two factors that are each bounded by mod is bounded by + // mod squared. If we multiply this product with yet another factor that + // is bounded by mod, the result will be bounded by mod cubed. As mod is + // equal to 1000000007 by default, the result might not fit into + // long longs, leading to an overflow. To avoid that, we must take + // the modulus of n * temp before multiplying it with temp yet again. + if (k & 1) + return n * temp % mod * temp % mod; + return temp * temp % mod; } int main() { long long n,k; @@ -1366,14 +1374,13 @@ // Calculate the runtime of the function. 
clock_t tStart = clock(); - long long res = expon(n, k); + long long res = fastExp(n, k); printf("Time taken: %.6fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC); printf("%lld\n", res); return 0; } - \end{minted} \textbf{Output} @@ -1668,4 +1675,4 @@ \subsection{Meet in the Middle } \end{thebibliography} -\end{document} +\end{document} \ No newline at end of file diff --git a/bundles/06-data-structures-1/06-data-structures-1.pdf b/bundles/06-data-structures-1/06-data-structures-1.pdf index 044d36d..f90143c 100644 Binary files a/bundles/06-data-structures-1/06-data-structures-1.pdf and b/bundles/06-data-structures-1/06-data-structures-1.pdf differ diff --git a/bundles/06-data-structures-1/latex/06-data-structures-1.tex b/bundles/06-data-structures-1/latex/06-data-structures-1.tex index 3a929dc..a9a5498 100644 --- a/bundles/06-data-structures-1/latex/06-data-structures-1.tex +++ b/bundles/06-data-structures-1/latex/06-data-structures-1.tex @@ -134,7 +134,7 @@ \textbf { inzva Algorithm Programme 2018-2019\\ \ \\ Bundle 6 \\ \ \\ - Veri Yap{\i}lar{\i}- 1 \\ \ \\ + Veri Yap{\i}lar{\i} - 1 \\ \ \\ } } \title{\vspace{-2em}\mytitle\vspace{-0.3em}} @@ -143,7 +143,9 @@ Tahsin Enes Kuru \\ \ \\ \textbf{Reviewers} \\ Baha Eren Yald{\i}z \\ - Burak Bu\u{g}rul + Burak Bu\u{g}rul \\ \ \\ + \textbf{Contributor} \\ + Kerim Kochekov \\ } \date{} \begin{document} @@ -160,11 +162,10 @@ \markboth{Table of Contents}{} \cleardoublepage - \section{Giri\c{s}} + \section{Giri\c{s}} \label{introduction} - Bilgisayar biliminde veri yap{\i}lar{\i}, belirli bir eleman k\"{u}mesi \"{u}zerinde verimli bir \c{s}eklide bilgi edinmemize ayn{\i} zamanda bu elemanlar \"{u}zerinde de\u{g}i\c{s}iklikler yapabilmemize olanak sa\u{g}layan yap{\i}lard{\i}r. 
\c{C}al{\i}\c{s}ma prensipleri genellikle elemanlar{\i}n de\u{g}erlerini belirli bir kurala g\"{o}re saklamak daha sonra bu yap{\i}lar{\i} kullanarak elemanlar hakk{\i}nda sorulara (Bir dizinin belirli bir aral{\i}\u{g}{\i}ndaki en k\"{u}\c{c}\"{u}k say{\i} gibi) cevap aramakt{\i}r. - \section{Dinamik Veri Yap{\i}lar{\i}} - + Bilgisayar biliminde veri yap{\i}lar{\i}, belirli bir eleman k\"{u}mesi \"{u}zerinde verimli bir \c{s}ekilde bilgi edinmemize, ayn{\i} zamanda bu elemanlar \"{u}zerinde de\u{g}i\c{s}iklikler yapabilmemize olanak sa\u{g}layan yap{\i}lard{\i}r. \c{C}al{\i}\c{s}ma prensipleri genellikle elemanlar{\i}n de\u{g}erlerini belirli bir kurala g\"{o}re saklamak daha sonra bu yap{\i}lar{\i} kullanarak elemanlar hakk{\i}nda sorulara (mesela, bir dizinin belirli bir aral{\i}\u{g}{\i}ndaki en k\"{u}\c{c}\"{u}k say{\i}y{\i} bulmak gibi) cevap aramakt{\i}r. + \section{Dinamik Veri Yap{\i}lar{\i}} \label{dynamic} \subsection{Linked List} Linked List veri yap{\i}s{\i}nda elemanlar, her eleman kendi de\u{g}erini ve bir sonraki eleman{\i}n adresini tutacak \c{s}ekilde saklan{\i}r. Yap{\i}daki elemanlar ba\c{s} elemandan(head) ba\c{s}lanarak son elemana(tail) gidecek \c{s}ekilde gezilebilir. Diziye kar\c{s}{\i}n avantaj{\i} haf{\i}zan{\i}n dinamik bir \c{s}ekilde kullan{\i}lmas{\i}d{\i}r. Bu veri yap{\i}s{\i}nda uygulanabilecek i\c{s}lemler: @@ -184,7 +185,7 @@ \clearpage \begin{minted}[frame=lines,linenos,fontsize=\footnotesize]{c++} -// Her bir elemani tutacak struct olusturuyoruz. +// Her bir elemani (burada sayilari, yani int) tutacak struct olusturuyoruz. struct node { int data; @@ -226,9 +227,10 @@ Stack veri yap{\i}s{\i}nda elemanlar yap{\i}ya son giren ilk \c{c}{\i}kar (LIFO) kural{\i}na uygun olacak \c{s}ekilde saklan{\i}r. Bu veri yap{\i}s{\i}nda uygulayabildi\u{g}imiz i\c{s}lemler: \begin{itemize} \item Veri yap{\i}s{\i}n{\i}n en \"{u}st\"{u}ne eleman ekleme. 
\item Veri yap{\i}s{\i}n{\i}n en \"{u}st\"{u}ndeki elemana eri\c{s}im. \item Veri yap{\i}s{\i}n{\i}n en \"{u}st\"{u}ndeki eleman{\i} silme. + \item Veri yap{\i}s{\i}n{\i}n bo\c{s} olup olmad{\i}\u{g}{\i}n{\i}n kontrol\"{u}. \end{itemize} c++ dilindeki stl k\"{u}t\"{u}phanesinde bulunan haz{\i}r stack yap{\i}s{\i}n{\i}n kullan{\i}m{\i} a\c{s}a\u{g}{\i}daki gibidir. @@ -236,12 +238,14 @@ \begin{minted}[frame=lines,linenos,fontsize=\footnotesize]{c++} int main() { stack < int > st; + cout << st.empty() << endl; // Ilk basta Stack bos oldugu icin ekrana 1 (true) yazilir. st.push(5); // Stack'in en ustune 5'i ekler. Stack'in yeni hali: {5} - st.push(7); // Stack'in en ustune 7'yi ekler. Stack'in yeni hali: {7,5} - st.push(6); // Stack'in en ustune 6'yiekler. Stack'in yeni hali : {6, 7, 5} + st.push(7); // Stack'in en ustune 7'yi ekler. Stack'in yeni hali: {7, 5} + st.push(6); // Stack'in en ustune 6'yi ekler. Stack'in yeni hali : {6, 7, 5} st.pop(); //Stack'in en ustundeki elemani siler. Stack'in yeni hali : {7, 5} - st.push(1); // Stack'in en ustune 1'i ekler. Stack'in yeni hali : {1 7, 5} + st.push(1); // Stack'in en ustune 1'i ekler. Stack'in yeni hali : {1, 7, 5} cout << st.top() << endl; // Stack'in en ustundeki elemana erisir. Ekrana 1 yazdirir. + cout << st.empty() << endl; // Burada Stack bos olmadigindan oturu ekrana 0 (false) yazilir. } \end{minted} @@ -253,6 +257,7 @@ \item Veri yap{\i}s{\i}n{\i}n en \"{u}st\"{u}ne eleman ekleme. \item Veri yap{\i}s{\i}n{\i}n en alt{\i}ndaki eleman{\i}na eri\c{s}im. \item Veri yap{\i}s{\i}n{\i}n en alt{\i}ndaki eleman{\i} silme. + \item Veri yap{\i}s{\i}n{\i}n bo\c{s} olup olmad{\i}\u{g}{\i}n{\i}n kontrol\"{u}. \end{itemize} c++ dilindeki stl k\"{u}t\"{u}phanesinde bulunan haz{\i}r queue yap{\i}s{\i}n{\i}n kullan{\i}m{\i} a\c{s}a\u{g}{\i}daki gibidir. @@ -260,12 +265,13 @@ \begin{minted}[frame=lines,linenos,fontsize=\footnotesize]{c++} int main() { queue < int > q; - q.push(5); // queue'in en ustune 5'i ekler. 
queue'in yeni hali: {5} - q.push(7); // queue'in en ustune 7'yi ekler. queue'in yeni hali: {7,5} - q.push(6); // queue'in en ustune 6'yi ekler. queue'in yeni hali : {6, 7, 5} - q.pop(); //queue'in en altindaki elemani siler. queue'in yeni hali : {6,7} - q.push(1); // queue'in en ustune 1'i ekler. queue'in yeni hali : {1,6,7} - cout << q.front() << endl; // queue'in en ustundeki elemana erisir. Ekrana 7 yazirir. + cout << q.empty() << endl; // Ilk basta Queue bos oldugu icin ekrana 1 (true) yazilir. + q.push(5); // Queue'in en ustune 5'i ekler. Queue'in yeni hali: {5} + q.push(7); // Queue'in en ustune 7'yi ekler. Queue'in yeni hali: {7, 5} + q.push(6); // Queue'in en ustune 6'yi ekler. Queue'in yeni hali : {6, 7, 5} + q.pop(); //Queue'in en altindaki elemani siler. Queue'in yeni hali : {6, 7} + q.push(1); // Queue'in en ustune 1'i ekler. Queue'in yeni hali : {1, 6, 7} + cout << q.front() << endl; // Queue'in en altindaki (ilk giren) elemana erisir. Ekrana 7 yazdirir. } \end{minted} @@ -296,11 +302,12 @@ q.pop_back(); // deque'nin en ustundeki elemanini silme. } \end{minted} + \textbf{Not:} deque veri yap{\i}s{\i} stack ve queue veri yap{\i}lar{\i}na g\"{o}re daha kapsaml{\i} oldu\u{g}undan \"{o}t\"{u}r\"{u}, bu yap{\i}lara g\"{o}re yakla\c{s}{\i}k 2 kat fazla haf{\i}za kulland{\i}\u{g}{\i}n{\i} s\"{o}yleyebiliriz. \cleardoublepage - \section{Prefix Sum} + \section{Prefix Sum} \label{prefixsum} - Prefix Sum dizisi bir dizinin prefixlerinin toplamlar{\i}yla olu\c{s}turulan bir veri yap{\i}s{\i}d{\i}r. Prefix sum dizisinin i indeksli eleman{\i} girdi dizisindeki 1 indeksli elemandan i indeksli elemana kadar olan elemanlar{\i}n toplam{\i}na e\c{s}it olacak \c{s}ekilde kurulur. Ba\c{s}ka bir de\u{g}i\c{s}le: $$sum_i = \sum_{j=1}^{i} {a_j} $$ \"{O}rnek bir $A$ dizisi i\c{c}in prefix sum dizisi \c{s}u \c{s}ekilde kurulmal{\i}d{\i}r: + Prefix Sum dizisi bir dizinin prefixlerinin toplamlar{\i}yla olu\c{s}turulan bir veri yap{\i}s{\i}d{\i}r. 
Prefix sum dizisinin i indeksli eleman{\i} girdi dizisindeki 1 indeksli elemandan i indeksli elemana kadar olan elemanlar{\i}n toplam{\i}na e\c{s}it olacak \c{s}ekilde kurulur. Ba\c{s}ka bir deyi\c{s}le: $$sum_i = \sum_{j=1}^{i} {a_j} $$ \"{O}rnek bir $A$ dizisi i\c{c}in prefix sum dizisi \c{s}u \c{s}ekilde kurulmal{\i}d{\i}r: \begin{table}[h] \centering @@ -320,10 +327,11 @@ \subsection{\"{O}rnek Kod Par\c{c}alar{\i}} - $sum_i = sum_{i -1} + a_i$ e\c{s}itli\u{g}i kolayca g\"{o}r\"{u}l\"{u}r. Ve bu e\c{s}itli\u{g}i kullanarak Prefix Sum dizisini girdi dizisindeki elemanlar{\i} s{\i}rayla gezerek kurabiliriz. + Prefix Sum dizisini kurarken $sum_i = sum_{i -1} + a_i$ e\c{s}itli\u{g}i kolayca g\"{o}r\"{u}lebilir ve bu e\c{s}itli\u{g}i kullanarak $sum[]$ dizisini girdi dizisindeki elemanlar{\i} s{\i}rayla gezerek kurabiliriz. \begin{minted}[frame=lines,linenos,fontsize=\footnotesize]{c++} -int n,sum[N],a[N]; +const int n = 100000; // ornek bir ust sinir +int sum[n+1], a[n+1]; // a dizisi girdi dizimiz, sum dizisi de prefix sum dizimiz olsun. void build() { @@ -341,23 +349,26 @@ Prefix sum dizisini kurma i\c{s}lemimizin zaman ve haf{\i}za karma\c{s}{\i}kl{\i}\u{g}{\i} $O(N)$. Her sorguya da $O(1)$ karma\c{s}{\i}kl{\i}kta cevap verebiliyoruz. - Prefix sum veri yap{\i}s{\i} ile ilgili problem: \href{https://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=24&page=show_problem&problem=1474}{Link}. + Prefix sum veri yap{\i}s{\i} ile ilgili problem: \href{https://codeforces.com/problemset/problem/816/B}{Link}. \cleardoublepage - \section{Sparse Table} + \section{Sparse Table} \label{sparsetable} - Sparse Table aral{\i}klardaki elemanlar{\i}n toplam{\i}, minimumu, maksimumu ve eboblar{\i} gibi sorgulara $O(logN)$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda cevap alabilmemizi sa\u{g}layan bir veri yap{\i}s{\i}d{\i}r. Baz{\i} tip sorgular (aral{\i}ktaki minimum, maksimum say{\i}y{\i} bulma gibi) ise $O(1)$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda yapmaya uygundur. 
+ Sparse table aral{\i}klardaki elemanlar{\i}n toplam{\i}, minimumu, maksimumu ve eboblar{\i} gibi sorgulara $O(\log(N))$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda cevap alabilmemizi sa\u{g}layan bir veri yap{\i}s{\i}d{\i}r. Baz{\i} tip sorgular (aral{\i}ktaki minimum, maksimum say{\i}y{\i} bulma gibi) ise $O(1)$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda yapmaya uygundur. - Bu veri yap{\i}s{\i} durumu de\u{g}i\c{s}meyen, sabit bir veri \"{u}zerinde \"{o}n i\c{s}lemler yaparak kurulur. Dinamik veriler i\c{c}in kullan{\i}\c{s}l{\i} de\u{g}ildir. Veri \"{u}zerinde herhangi bir de\u{g}i\c{s}iklik durumda Sparse Table tekrardan kurulmal{\i}d{\i}r. Bu da maliyetli bir durumdur. + Bu veri yap{\i}s{\i} durumu de\u{g}i\c{s}meyen, sabit bir veri \"{u}zerinde \"{o}n i\c{s}lemler yaparak kurulur. Dinamik veriler i\c{c}in kullan{\i}\c{s}l{\i} de\u{g}ildir. Veri \"{u}zerindeki herhangi bir de\u{g}i\c{s}iklik durumunda Sparse table tekrardan kurulmal{\i}d{\i}r. Bu da maliyetli bir durumdur. \subsection{Yap{\i}s{\i} ve Kurulu\c{s}u} - Sparse table iki bouyutlu bir dizi \c{s}eklinde, $O(NlogN)$ haf{\i}za karma\c{s}{\i}kl{\i}\u{g}{\i}na sahip bir veri yap{\i}s{\i}d{\i}r. Dizinin her eleman{\i}ndan 2'nin kuvvetleri uzakl{\i}ktaki elemanlara kadar olan cevaplar Sparse tableda saklan{\i}r. $ST_{x,i}$, $x$ indeksli elemandan $x + 2^i$ indeksli elemana kadar olan aral{\i}\u{g}{\i}n cevab{\i}n{\i} saklayacak \c{s}ekilde sparse table kurulur. + Sparse table iki boyutlu bir dizi \c{s}eklinde, $O(N\log(N))$ haf{\i}za karma\c{s}{\i}kl{\i}\u{g}{\i}na sahip bir veri yap{\i}s{\i}d{\i}r. Dizinin her eleman{\i}ndan 2'nin kuvvetleri uzakl{\i}ktaki elemanlara kadar olan cevaplar Sparse table'da saklan{\i}r. $ST_{x,i}$, $x$ indeksli elemandan $x + 2^i - 1$ indeksli elemana kadar olan aral{\i}\u{g}{\i}n cevab{\i}n{\i} saklayacak \c{s}ekilde sparse table kurulur. 
\begin{minted}[frame=lines,linenos,fontsize=\footnotesize]{c++} //Toplam sorgusu icin kurulmus Sparse Table Yapisi -void build() { +const int n = 100000; // ornek bir ust sinir +const int LOG = 17; // log2(n) degerinin tavani +int a[n+1], ST[2*n][LOG+1]; +void build() { for (int i = 1 ; i <= n ; i++) { // [i,i] araliginin cevabi dizinin i indeksli elemanina esittir. ST[i][0] = a[i]; } for (int j = 1 ; j <= LOG ; j++) for (int i = 1 ; i <= n ; i++) { - // [i,i+2^(j)] araliginin cevabi - // [i,i+2^(j - 1) - 1] araligi ile [i+2^(j - 1),i+2^j] araliginin + // [i,i+2^(j)-1] araliginin cevabi + // [i,i+2^(j - 1) - 1] araligi ile [i+2^(j - 1),i+2^j-1] araliginin // cevaplarinin birlesmesiyle elde edilir ST[i][j] = ST[i][j - 1] + ST[i + (1 << (j - 1))][j - 1]; } @@ -381,13 +392,13 @@ Herhangi bir $[l,r]$ aral{\i}\u{g}{\i} i\c{c}in sorgu algoritmas{\i} s{\i}ras{\i}yla \c{s}u \c{s}ekilde \c{c}al{\i}\c{s}{\i}r: \begin{itemize} - \item $[l,r]$ aral{\i}\u{g}{\i}n{\i} cevaplar{\i}n{\i} \"{o}nceden hesaplad{\i}\u{g}{\i}m{\i}z aral{\i}klara par\c{c}ala. (Sadece $2$'nin kuvveti uzunlu\u{g}unda par\c{c}alar{\i}n cevaplar{\i}n{\i} saklad{\i}\u{g}{\i}m{\i}z i\c{c}in aral{\i}\u{g}{\i}m{\i}z{\i} $2$'nin kuvveti uzunlu\u{g}unda aral{\i}klara ay{\i}rmal{\i}y{\i}z. $[l,r]$ aral{\i}\u{g}{\i}n{\i}n uzunlu\u{g}unun iklik tabanda yazd{\i}\u{g}{\i}m{\i}zda hangi aral{\i}klara par\c{c}alamam{\i}z gerekti\u{g}ini bulmu\c{s} oluruz.) + \item $[l,r]$ aral{\i}\u{g}{\i}n{\i} cevaplar{\i}n{\i} \"{o}nceden hesaplad{\i}\u{g}{\i}m{\i}z aral{\i}klara par\c{c}ala (sadece $2$'nin kuvveti uzunlu\u{g}unda par\c{c}alar{\i}n cevaplar{\i}n{\i} saklad{\i}\u{g}{\i}m{\i}z i\c{c}in aral{\i}\u{g}{\i}m{\i}z{\i} $2$'nin kuvveti uzunlu\u{g}unda aral{\i}klara ay{\i}rmal{\i}y{\i}z. $[l,r]$ aral{\i}\u{g}{\i}n{\i}n uzunlu\u{g}unu ikilik tabanda yazd{\i}\u{g}{\i}m{\i}zda hangi aral{\i}klara par\c{c}alamam{\i}z gerekti\u{g}ini bulmu\c{s} oluruz.) \item Bu aral{\i}klardan gelen cevaplar{\i} birle\c{s}tirerek $[l,r]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}n{\i} hesapla. 
\end{itemize} - Herhangi bir aral{\i}\u{g}{\i}n uzunlu\u{g}unun ikilik tabandaki yaz{\i}l{\i}\c{s}{\i}ndaki $1$ rakamlar{\i}n{\i}n say{\i}s{\i} en fazla $logN$ olabilece\u{g}inden par\c{c}alayaca\u{g}{\i}m{\i}z aral{\i}k say{\i}s{\i} da en fazla $logN$ olur. Dolay{\i}s{\i}yla sorgu i\c{s}lemimiz $O(logN)$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda \c{c}al{\i}\c{s}{\i}r. + Herhangi bir aral{\i}\u{g}{\i}n uzunlu\u{g}unun ikilik tabandaki yaz{\i}l{\i}\c{s}{\i}ndaki $1$ rakamlar{\i}n{\i}n say{\i}s{\i} en fazla $\log(N)$ olabilece\u{g}inden par\c{c}alayaca\u{g}{\i}m{\i}z aral{\i}k say{\i}s{\i} da en fazla $\log(N)$ olur. Dolay{\i}s{\i}yla sorgu i\c{s}lemimiz $O(\log(N))$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda \c{c}al{\i}\c{s}{\i}r. \"{O}rne\u{g}in: $[4,17]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}n{\i} hesaplamak i\c{c}in algoritmam{\i}z $[4,17]$ aral{\i}\u{g}{\i}n{\i} $[4,11]$,$[12,15]$ ve $[16,17]$ aral{\i}klar{\i}na ay{\i}r{\i}r ve bu $3$ aral{\i}ktan gelen cevaplar{\i} birle\c{s}tirerek istenilen cevab{\i} hesaplar. @@ -420,7 +431,7 @@ \begin{minted}[frame=lines,linenos,fontsize=\footnotesize]{c++} int RMQ(int l,int r) { - // log dizisinde her sayinin log2 degerleri sakldir. + // log[] dizisinde her sayinin onceden hesapladigimiz log2 degerleri saklidir. int j = log[r - l + 1]; return min(ST[l][j], ST[r - (1 << j) + 1][j]); } @@ -430,9 +441,9 @@ \cleardoublepage - \section{Binary Indexed Tree} + \section{Binary Indexed Tree} \label{bit} - Fenwick tree olarak da bilinen Binary Indexed Tree, Prefix sum ve Sparse Table yap{\i}lar{\i}na benzer bir yap{\i}da olup dizi \"{u}zerinde de\u{g}i\c{s}iklik yapabilmemize olanak sa\u{g}layan bir veri yap{\i}s{\i}d{\i}r. Fenwick Tree'nin di\u{g}er veri yap{\i}lar{\i}na g\"{o}re en b\"{u}y\"{u}k avantaj{\i} pratikte daha h{\i}zl{\i} olmas{\i} ve haf{\i}za karma\c{s}{\i}kl{\i}\u{g}{\i}n{\i}n $O(N)$ olmas{\i}d{\i}r. 
Ancak Fenwick Tree'de sadece prefix cevaplar{\i} saklayabildi\u{g}imizden aral{\i}klarda minimum, maksimum ve en b\"{u}y\"{u}k ortak b\"{o}len gibi baz{\i} sorgular{\i}n cevaplar{\i}n{\i} elde edemeyiz. +Fenwick Tree olarak da bilinen Binary Indexed Tree, Prefix Sum$^\ref{prefixsum}$ ve Sparse Table$^\ref{sparsetable}$ yap{\i}lar{\i}na benzer bir yap{\i}da olup dizi \"{u}zerinde de\u{g}i\c{s}iklik yapabilmemize olanak sa\u{g}layan bir veri yap{\i}s{\i}d{\i}r. Fenwick Tree'nin di\u{g}er veri yap{\i}lar{\i}na g\"{o}re en b\"{u}y\"{u}k avantaj{\i} pratikte daha h{\i}zl{\i} olmas{\i} ve haf{\i}za karma\c{s}{\i}kl{\i}\u{g}{\i}n{\i}n $O(N)$ olmas{\i}d{\i}r. Ancak Fenwick Tree'de sadece prefix cevaplar{\i} (veya suffix cevaplar{\i}) saklayabildi\u{g}imizden aral{\i}klarda minimum, maksimum ve EBOB gibi baz{\i} sorgular{\i}n cevaplar{\i}n{\i} elde edemeyiz. \subsection{Yap{\i}s{\i} ve Kurulu\c{s}u} @@ -440,9 +451,9 @@ \begin{figure}[h] \centering - \includegraphics[width=\linewidth/1]{fenwick.png} + \includegraphics[scale=0.8]{fenwick.png} \label{fig:fenwick} - \caption{$a = [62,6,85,60,39,47,60,16,17]$ dizisinde toplam sorgusu i\c{c}in kurulmu\c{s} Fenwick Tree yap{\i}s{\i}} + \caption{8 uzunlu\u{g}undaki bir dizi i\c{c}in kurulmu\c{s} Fenwick Tree yap{\i}s{\i}} \end{figure} \clearpage @@ -456,7 +467,7 @@ \item x'in de\u{g}erini $x - g(x)$ yap. E\u{g}er x'in yeni de\u{g}eri $0$'dan b\"{u}y\"{u}k ise 1.i\c{s}lemden hesaplamaya devam et. \end{enumerate} - $[1,x]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}n{\i} hesaplamak i\c{c}in yap{\i}lan i\c{s}lem say{\i}s{\i} $x$ say{\i}s{\i}n{\i}n $2$'lik tabandaki yaz{\i}l{\i}\c{s}{\i}ndaki $1$ say{\i}s{\i}na e\c{s}ittir. \c{C}\"{u}nk\"{u} her d\"{o}ng\"{u}de x'den $2$'lik tabandaki yaz{\i}l{\i}\c{s}{\i}ndaki en sa\u{g}daki $1$ bitini \c{c}{\i}kart{\i}yoruz. Dolay{\i}s{\i}yla sorgu i\c{s}lemimiz $O(logN)$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda \c{c}al{\i}\c{s}{\i}r. 
$[l,r]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}n{\i} da $[1,r]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}ndan $[1,l - 1]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}n{\i} \c{c}{\i}kararak kolay bir \c{s}ekilde elde edebiliriz. + $[1,x]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}n{\i} hesaplamak i\c{c}in yap{\i}lan i\c{s}lem say{\i}s{\i} $x$ say{\i}s{\i}n{\i}n $2$'lik tabandaki yaz{\i}l{\i}\c{s}{\i}ndaki $1$ say{\i}s{\i}na e\c{s}ittir. \c{C}\"{u}nk\"{u} her d\"{o}ng\"{u}de x'ten $2$'lik tabandaki yaz{\i}l{\i}\c{s}{\i}ndaki en sa\u{g}daki $1$ bitini \c{c}{\i}kart{\i}yoruz. Dolay{\i}s{\i}yla sorgu i\c{s}lemimiz $O(\log(N))$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda \c{c}al{\i}\c{s}{\i}r. $[l,r]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}n{\i} da $[1,r]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}ndan $[1,l - 1]$ aral{\i}\u{g}{\i}n{\i}n cevab{\i}n{\i} \c{c}{\i}kararak kolay bir \c{s}ekilde elde edebiliriz. \fbox{ \parbox{\textwidth} @@ -473,7 +484,7 @@ \item A\u{g}a\c{c}ta $x$ indeksli eleman{\i} i\c{c}eren t\"{u}m d\"{u}\u{g}\"{u}mlerin de\u{g}erlerini g\"{u}ncelle. \end{itemize} - Fenwick Tree'de x indeksli eleman{\i} i\c{c}eren maksimum $logN$ tane aral{\i}k oldu\u{g}undan g\"{u}ncelleme algoritmas{\i} $O(logN)$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda \c{c}al{\i}\c{s}{\i}r. + Fenwick Tree'de x indeksli eleman{\i} i\c{c}eren maksimum $\log(N)$ tane aral{\i}k oldu\u{g}undan g\"{u}ncelleme algoritmas{\i} $O(\log(N))$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda \c{c}al{\i}\c{s}{\i}r. \clearpage @@ -481,9 +492,11 @@ \begin{minted}[frame=lines,linenos,fontsize=\footnotesize]{c++} -int n,tree[N],a[N]; +const int n = 1e5; +int tree[n+1], a[n+1]; -void add(int val,int x) { // x indeksli elemanin degerini val degeri kadar artirir. +void add(int val, int x) { //x indeksli elemanin degerini val degeri kadar artirir. + //x indeksinin etkiledigi butun dugumleri val degeri kadar artirir. while(x <= n) { tree[x] += val; x += x & (-x); @@ -500,18 +513,20 @@ return res; } -int query(int l,int r) { // [l,r] araligindaki elemanlarin toplamini verir. 
+int query(int l, int r) { // [l,r] araligindaki elemanlarin toplamini verir. return sum(r) - sum(l - 1); } void build() { // a dizisi uzerine fenwick tree yapisini kuruyoruz. for (int i = 1 ; i <= n ; i++) - add(a[i],i); + add(a[i], i); return; } \end{minted} + Fenwick Tree veri yap{\i}s{\i} ile ilgili problem: \href{https://www.spoj.com/problems/CSUMQ/}{Link}. + \subsection{Aral{\i}k G\"{u}ncelleme ve Eleman Sorgu} Bir $a$ dizisi \"{u}zerinde i\c{s}lemler yapaca\u{g}{\i}m{\i}z{\i} varsayal{\i}m, daha sonra $a$ dizisi $b$ dizisinin prefix sum dizisi olacak \c{s}ekilde bir $b$ dizisi tan{\i}mlayal{\i}m. Ba\c{s}ka bir deyi\c{s}le $a_i = \sum_{j=1}^{i} {b_j} $ olmal{\i}d{\i}r. Sonradan olu\c{s}turdu\u{g}umuz $b$ dizisi \"{u}zerine fenwick tree yap{\i}s{\i}n{\i} kural{\i}m. $[l,r]$ aral{\i}\u{g}{\i}ndaki her elemana @@ -525,7 +540,8 @@ \subsubsection{\"{O}rnek Kod Par\c{c}alar{\i}} \begin{minted}[frame=lines,linenos,fontsize=\footnotesize]{c++} -int n,a[N],b[N]; +const int n = 1e5; +int a[n+1], b[n+1]; void add(int val,int x) { // x indeksli elemanin degerini val degeri kadar artirir. while(x <= n) { @@ -548,12 +564,12 @@ b[i] = a[i] - a[i - 1]; // b dizisini olusturuyoruz. for (int i = 1 ; i <= n ; i++) - add(b[i],i); // b dizisi uzerine fenwick tree kuruyoruz. + add(b[i], i); // b dizisi uzerine fenwick tree kuruyoruz. } -void update(int l,int r,int x) { - add(x,l); - add(-x,r + 1); +void update(int l, int r, int x) { + add(x, l); + add(-x, r + 1); } void query(int x) { @@ -562,7 +578,6 @@ \end{minted} - Fenwick Tree veri yap{\i}s{\i} ile ilgili problem: \href{https://www.spoj.com/problems/CSUMQ/}{Link}. 
\cleardoublepage \section{SQRT Decomposition} @@ -676,10 +691,10 @@ \section{Segment Tree} - Segment Tree bir dizide $O(log N)$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda herhangi bir $[l,r]$ aral{\i}\u{g}{\i} icin minimum, maksimum, toplam gibi sorgulara cevap verebilmemize ve bu aral{\i}klar \"{u}zerinde g\"{u}ncelleme yapabilmemize olanak sa\u{g}layan bir veri yap{\i}s{\i}d{\i}r. + Segment Tree bir dizide $O(\log(N))$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda herhangi bir $[l,r]$ aral{\i}\u{g}{\i} i\c{c}in minimum, maksimum, toplam gibi sorgulara cevap verebilmemize ve bu aral{\i}klar \"{u}zerinde g\"{u}ncelleme yapabilmemize olanak sa\u{g}layan bir veri yap{\i}s{\i}d{\i}r. - Segment Tree'nin, Fenwick Tree ve Sparse Table yap{\i}lar{\i}ndan farkl{\i} olarak elemanlar \"{u}zerinde g\"{u}ncelleme yap{\i}labilmesi ve minimum, maksimum gibi sorgulara da olanak sa\u{g}lamas{\i} y\"{o}n\"{u}nden daha kullan{\i}\c{s}l{\i}d{\i}r. Ayr{\i}ca segment tree $O(N)$ haf{\i}za karma\c{s}{\i}kl{\i}\u{g}{\i}na sahipken Sparse Table yaps{\i}n{\i}nda - gereken haf{\i}za karma\c{s}{\i}kl{\i}\u{g}{\i} $O(NlogN)$'dir. + Segment Tree'nin, Fenwick Tree$^\ref{bit}$ ve Sparse Table$^\ref{sparsetable}$ yap{\i}lar{\i}ndan farkl{\i} olarak elemanlar \"{u}zerinde g\"{u}ncelleme yap{\i}labilmesi ve minimum, maksimum gibi sorgulara da olanak sa\u{g}lamas{\i} y\"{o}n\"{u}nden daha kullan{\i}\c{s}l{\i}d{\i}r. Ayr{\i}ca Segment Tree $O(N)$ haf{\i}za karma\c{s}{\i}kl{\i}\u{g}{\i}na sahipken Sparse Table yap{\i}s{\i}nda + gereken haf{\i}za karma\c{s}{\i}kl{\i}\u{g}{\i} $O(N \log(N))$'dir. \subsection {Yap{\i}s{\i} ve Kurulu\c{s}u} @@ -705,8 +720,8 @@ } else { int mid = (l + r) / 2; - build(ind * 2,l,mid); - build(ind * 2 + 1,mid + 1,r); + build(ind * 2, l, mid); + build(ind * 2 + 1, mid + 1, r); // [l,r] araliginin cevabi // [l,mid] ve [mid + 1,r] araliklarinin cevaplarinin birlesmesiyle olusur. 
tree[ind] = tree[ind * 2] + tree[ind * 2 + 1]; @@ -731,7 +746,7 @@ \end{itemize} A\u{g}ac{\i}n her derinli\u{g}inde cevab{\i}m{\i}z i\c{c}in gerekli aral{\i}klardan maksimum 2 - adet bulunabilir. Bu y\"{u}zden sorgu algoritmas{\i} $O(logN)$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda \c{c}al{\i}\c{s}{\i}r. + adet bulunabilir. Bu y\"{u}zden sorgu algoritmas{\i} $O(\log(N))$ zaman karma\c{s}{\i}kl{\i}\u{g}{\i}nda \c{c}al{\i}\c{s}{\i}r. \clearpage @@ -750,7 +765,7 @@ // [lw,rw] sorguda cevabini aradigimiz aralik. // [l,r] ise agactaki ind nolu node'da cevabini sakladigimiz aralik. -int query(int ind,int l,int r,int lw,int rw) { +int query(int ind, int l, int r, int lw, int rw) { if (l > rw or r < lw) //bulundugumuz aralik cevabini aradigimiz araligin disinda. return 0; if (l >= lw and r <= rw) //cevabini aradigimiz aralik bu araligi tamamen kapsiyor. @@ -758,7 +773,8 @@ int mid = (l + r) / 2; //Agacta recursive bir sekilde araligimizi // araliklara bolup gelen cevaplari birlestiriyoruz. - return query(ind * 2,l,mid,lw,rw) + query(ind * 2 + 1,mid + 1,r,lw,rw); + return query(ind * 2, l, mid, lw, rw) + + query(ind * 2 + 1, mid + 1, r, lw, rw); } \end{minted} @@ -768,10 +784,10 @@ Dizideki $x$ indeksli eleman{\i}n{\i}n de\u{g}erini g\"{u}ncellemek i\c{c}in kullan{\i}lan algoritma \c{s}u \c{s}ekilde \c{c}al{\i}\c{s}{\i}r. \begin{itemize} - \item A\u{g}a\c{c}ta $x$ indeksli eleman{\i} i\c{c}eren tum d\"{u}\u{g}\"umlerin de\u{g}erlerini g\"{u}ncelle. + \item A\u{g}a\c{c}ta $x$ indeksli eleman{\i} i\c{c}eren t\"{u}m d\"{u}\u{g}\"{u}mlerin de\u{g}erlerini g\"{u}ncelle. \end{itemize} - A\u{g}a\c{c}ta x indeksli eleman{\i}n cevab{\i}n{\i} tutan yaprak d\"{u}\u{g}\"{u}mden root d\"{u}\u{g}\"{u}me kadar toplamda $logN$ d\"{u}\u{g}\"{u}m\"{u}n de\u{g}erini g\"{u}ncellememiz yeterlidir. Dolay{\i}s{\i}yla herhangi bir eleman{\i}n de\u{g}erini g\"{u}ncellemenin zaman karma\c{s}{\i}kl{\i}\u{g}{\i} $O(logN)$'dir. 
+ A\u{g}a\c{c}ta x indeksli eleman{\i}n cevab{\i}n{\i} tutan yaprak d\"{u}\u{g}\"{u}mden root d\"{u}\u{g}\"{u}me kadar toplamda $\log(N)$ d\"{u}\u{g}\"{u}m\"{u}n de\u{g}erini g\"{u}ncellememiz yeterlidir. Dolay{\i}s{\i}yla herhangi bir eleman{\i}n de\u{g}erini g\"{u}ncellemenin zaman karma\c{s}{\i}kl{\i}\u{g}{\i} $O(\log(N))$'dir. \begin{figure}[h] \centering @@ -872,6 +888,10 @@ \bibitem{16} https://visualgo.net/en/list + + \bibitem{17} + + https://cp-algorithms.com/data\_structures/fenwick.html \end{thebibliography} diff --git a/bundles/06-data-structures-1/latex/fenwick.png b/bundles/06-data-structures-1/latex/fenwick.png new file mode 100644 index 0000000..06d5051 Binary files /dev/null and b/bundles/06-data-structures-1/latex/fenwick.png differ diff --git a/bundles/11-graph-3/README.md b/bundles/11-graph-3/README.md index 9810652..3db0f50 100644 --- a/bundles/11-graph-3/README.md +++ b/bundles/11-graph-3/README.md @@ -1,7 +1,7 @@ Onsite Contest -------------- -TBA +* https://www.hackerrank.com/inzva-11-graph-3-onsite-2019 Online Contest -------------- -TBA \ No newline at end of file +* https://www.hackerrank.com/inzva-11-graph-3-online-2019 diff --git a/bundles/12-math-3/Readme.md b/bundles/12-math-3/Readme.md index 8b13789..11c3afb 100644 --- a/bundles/12-math-3/Readme.md +++ b/bundles/12-math-3/Readme.md @@ -1 +1,3 @@ - +Online Contest +-------------- +* https://www.hackerrank.com/inzva-12-math-3-online-2019 diff --git a/bundles/README.md b/bundles/README.md new file mode 100644 index 0000000..2fb0cbc --- /dev/null +++ b/bundles/README.md @@ -0,0 +1,19 @@ +**BUNDLES** + +| Name | Topics | +|------|-------| +| [01-Intro](https://github.com/inzva/Algorithm-Program/tree/master/bundles/01-intro) | Big O Notation, Recursion, Builtin Data Structures| +| [02-Algorithms-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/02-algorithms-1) | Binary Search, Ternary Search, Sorting Algorithms, Quickselect, Divide and Conquer| +| 
[03-Math-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/03-math-1) | Number Theory, Sieve of Eratosthenes, Modular Inverse, GCD, LCM, Factorization, Combinatorics, Exponentiation, Meet in the Middle| +| [04-Graph-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/04-graph-1) | Representing Graphs, Tree Traversals (Preorder, Inorder, Postorder), Binary Search Tree, DFS, BFS, Union Find (DSU), Heap| +| [05-DP-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/05-dp-1) | Greedy Algorithms, Dynamic Programming, Memoization, Knapsack, Coin Problem, LCS, LIS| +| [06-Data-Structures-1](https://github.com/inzva/Algorithm-Program/tree/master/bundles/06-data-structures-1) | Stack, Queue, Deque, Linked List, Prefix Sum, Sparse Table, Binary Indexed Tree, SQRT Decomposition, Segment Tree| +| [07-Graph-2](https://github.com/inzva/Algorithm-Program/tree/master/bundles/07-graph-2) | Bipartite Checking, Topological Sort, Shortest Path (Dijkstra, Floyd-Warshall, Bellman Ford), Minimum Spanning Tree (Prim's, Kruskal's)| +| [08-Data-Structures-2](https://github.com/inzva/Algorithm-Program/tree/master/bundles/08-data-structures-2) | Self Balancing Binary Trees, Treap, AVL Tree, Red Black Tree, Lowest Common Ancestor| +| [09-Data-Structures-3](https://github.com/inzva/Algorithm-Program/tree/master/bundles/09-data-structures-3) | Segment Tree with Lazy Propagation, Binary Search on Segment Tree, Mo's Algorithm, Trie| +| [10-DP-2](https://github.com/inzva/Algorithm-Program/tree/master/bundles/10-dp-2) | Bitmask DP, DP on Rooted Trees, DP on DAGs, Digit DP, Tree Child-Sibling Notation| +| [11-Graph-3](https://github.com/inzva/Algorithm-Program/tree/master/bundles/11-graph-3) | Bridges and Articulation Points, Strongly Connected Components (SCC), BCC, Cycle Finding, Max Flow| +| [12-Math-3](https://github.com/inzva/Algorithm-Program/tree/master/bundles/12-math-3) | Vector Calculus, Area Calculation, Lines and Planes, Intersection, Convex 
Hull Problem, Rotating Calipers, Closest Pair Problem| +| [13-Graph-5](https://github.com/inzva/Algorithm-Program/tree/master/bundles/13-graph-5) | Segment Tree on a Tree, Heavy-Light Decomposition, Centroid Decomposition of a Tree, Subtrees' Set-Swap Technique| +| [14-Algorithms-5](https://github.com/inzva/Algorithm-Program/tree/master/bundles/14-Algorithms-5) | String Matching Algorithms: KMP, Rabin-Karp Algorithm, Suffix Array, Longest Common Prefix Array| + diff --git a/docs/algorithms/img/binary_search.png b/docs/algorithms/img/binary_search.png new file mode 100644 index 0000000..409349f Binary files /dev/null and b/docs/algorithms/img/binary_search.png differ diff --git a/docs/algorithms/img/divide_and_conquer.png b/docs/algorithms/img/divide_and_conquer.png new file mode 100644 index 0000000..366743a Binary files /dev/null and b/docs/algorithms/img/divide_and_conquer.png differ diff --git a/docs/algorithms/img/inzva-logo.png b/docs/algorithms/img/inzva-logo.png new file mode 100644 index 0000000..c044253 Binary files /dev/null and b/docs/algorithms/img/inzva-logo.png differ diff --git a/docs/algorithms/img/linear_search.png b/docs/algorithms/img/linear_search.png new file mode 100644 index 0000000..78a394f Binary files /dev/null and b/docs/algorithms/img/linear_search.png differ diff --git a/docs/algorithms/img/ternary_search.png b/docs/algorithms/img/ternary_search.png new file mode 100644 index 0000000..ea4cc6b Binary files /dev/null and b/docs/algorithms/img/ternary_search.png differ diff --git a/docs/algorithms/index.md b/docs/algorithms/index.md new file mode 100644 index 0000000..81e3b9f --- /dev/null +++ b/docs/algorithms/index.md @@ -0,0 +1,373 @@ +--- +title: Algorithms +tags: + - Algorithms + - Linear Search + - Binary Search + - Ternary Search + - Sorting Algorithms + - Insertion Sort + - Merge Sort + - Quick Sort + - Radix Sort + - Quickselect Algorithm + - Divide and Conquer +--- + +**Editor:** Kadir Emre Oto + +**Reviewers:** Muhammed Burak 
Buğrul, Tahsin Enes Kuru + +## Search Algorithms + +It may be necessary to determine whether an array or solution set contains a specific value, and we call this process **searching**. In this article, the three most common search algorithms will be discussed: linear search, binary search, and ternary search. + +[This visualization](https://www.cs.usfca.edu/~galles/visualization/Search.html){target="_blank"} may help you understand how the search algorithms work. + +### Linear Search + +The simplest search algorithm is *linear search*, also known as *sequential search*. In this technique, all elements in the collection are checked one by one; if any element matches the key, the algorithm returns its index; otherwise, it returns $-1$. + +Its time complexity is $\mathcal{O}(N)$. + +
+<figure markdown>
+![Example for linear search](img/linear_search.png){ width="90%" }
+<figcaption>Example for linear search</figcaption>
+</figure>
+ +```c++ +int linearSearch(int *array, int size, int key) { + for (int i = 0; i < size; i++) + if (array[i] == key) + return i; + return -1; +} +``` + +### Binary Search + +Linear search is quite slow because it compares each element of the set with the search key. For **sorted** data, however, there is a much faster technique: **binary search**. After each comparison, the algorithm eliminates half of the data using the sorting property. + +We can also use binary search on increasing functions in the same way. + +#### Procedure + +1. Compare the key with the middle element of the array, +2. If it is a match, return the index of the middle element. +3. If the key is greater than the middle element, the key must be on the right side of the middle; we can eliminate the left side. +4. If the key is smaller, it should be on the left side. The right side can be ignored. + +#### Complexity + +$$ +\begin{align*} +T(N) &= T\left(\tfrac{N}{2}\right) + \mathcal{O}(1) \\ +T(N) &= \mathcal{O}(\log N) +\end{align*} +$$ + +
+<figure markdown>
+![Example for binary search](img/binary_search.png){ width="90%" }
+<figcaption>Example for binary search</figcaption>
+</figure>
+ +```c++ +int binarySearch(int *array, int size, int key) { + int left = 0, right = size, mid; + + while (left < right) { + mid = (left + right) / 2; + + if (array[mid] >= key) + right = mid; + else + left = mid + 1; + } + // left may equal size when the key is larger than every element. + return (left < size and array[left] == key) ? left : -1; +} +``` + +### Ternary Search + +Suppose that we have a [unimodal](https://www.geeksforgeeks.org/mathematics-unimodal-functions-bimodal-functions/){target="_blank"} function, $f(x)$, on an interval $[l, r]$, and we are asked to find its minimum or maximum value, depending on its behavior. + +There are two types of unimodal functions: + +1. The function $f(x)$ strictly increases for $x \leq m$, reaches a global maximum at $x = m$, and then strictly decreases for $m \leq x$. There are no other local maxima. + +2. The function $f(x)$ strictly decreases for $x \leq m$, reaches a global minimum at $x = m$, and then strictly increases for $m \leq x$. There are no other local minima. + +In this document, we will implement the first type of unimodal function; the second one can be solved using the same logic. + +#### Procedure + +1. Choose any two points $m_1$ and $m_2$ on the interval $[l, r]$, where $l < m_1 < m_2 < r$. +2. If $f(m_1) < f(m_2)$, the maximum must be in the interval $[m_1, r]$, so we can ignore the interval $[l, m_1]$ and move $l$ to $m_1$. +3. Otherwise ($f(m_1) \geq f(m_2)$), the maximum must be in the interval $[l, m_2]$; move $r$ to $m_2$. +4. If $r - l < \epsilon$, where $\epsilon$ is a negligible value, stop the algorithm and return $l$. Otherwise, return to step 1. + +$m_1$ and $m_2$ can be selected as $m_1 = l + \frac{r-l}{3}$ and $m_2 = r - \frac{r-l}{3}$ so that the interval shrinks by a constant factor in every iteration. + +#### Complexity + +$$ +\begin{align*} +T(N) &= T\left(2 \cdot \tfrac{N}{3}\right) + \mathcal{O}(1) \\ +T(N) &= \mathcal{O}(\log N) +\end{align*} +$$ + +
+<figure markdown>
+![Example for ternary search](img/ternary_search.png)
+<figcaption>Example for ternary search</figcaption>
+</figure>
+ +```c++ +double f(double x); + +double ternarySearch(double left, double right, double eps = 1e-7) { + while (right - left > eps) { + double mid1 = left + (right - left) / 3; + double mid2 = right - (right - left) / 3; + + if (f(mid1) < f(mid2)) + left = mid1; + else + right = mid2; + } + return f(left); +} +``` + +## Sorting Algorithms + +Sorting algorithms are used to put the elements of an array in a certain order according to a comparison operator. Numerical and lexicographical orders are the most common ones. There are a large number of sorting algorithms, but we discuss four of them: + +- *Insertion Sort* +- *Merge Sort* +- *Quick Sort* +- *Radix Sort* + +For a better understanding, you are strongly recommended to visit [this visualization site](https://visualgo.net/en/sorting){target="_blank"} after reading the topics. + +### Insertion Sort + +Imagine you are playing a card game and want to sort your cards before the game. Your strategy is simple: one part of your hand is already sorted, and every time you pick up the next card from the unsorted part, you insert it into its correct place in the sorted part. After you apply this process to all cards, the whole deck will be sorted. + +This is the basic idea for sorting an array. We assume that the first element of the array is the sorted part, and the other elements form the unsorted part. At each step, we choose the leftmost element of the unsorted part and insert it into the sorted part. In this way the left part of the array always remains sorted after every iteration, and when no element is left in the unsorted part, the array is sorted. + +```c++ +void insertionSort(int *ar, int size) { + for (int i = 1; i < size; i++) + for (int j = i - 1; 0 <= j and ar[j] > ar[j + 1]; j--) + swap(ar[j], ar[j + 1]); +} +``` + +### Merge Sort + +*Merge Sort* is one of the fastest sorting algorithms and uses the *Divide and Conquer* paradigm. 
The algorithm **divides** the array into two halves, solves each part **recursively** using the same sorting function, and **combines** them in linear time by repeatedly selecting the smaller of the two front values. + +#### Procedure + +1. If the size of the array is 1, it is already sorted; stop the algorithm (base case), +2. Find the middle point of the array and split it in two, +3. Apply the algorithm to these parts separately, starting from the first step, +4. After the two halves are sorted, merge them in linear time; the array is now sorted. + +#### Complexity + +$$ +\begin{align*} +T(N) &= 2 \cdot T\left(\tfrac{N}{2}\right) + \mathcal{O}(N) \\ +T(N) &= \mathcal{O}(N \cdot \log N) +\end{align*} +$$ + +```c++ +void mergeSort(int *ar, int size) { + if (size <= 1) // base case + return; + + mergeSort(ar, size / 2); // divide the array into two almost equal parts + mergeSort(ar + size / 2, size - size / 2); + + int index = 0, left = 0, right = size / 2; // merge them + int *temp = new int[size]; + + while (left < size / 2 or right < size) { + if (right == size or (left < size / 2 and ar[left] < ar[right])) + temp[index++] = ar[left++]; + else + temp[index++] = ar[right++]; + } + for (int i = 0; i < size; i++) + ar[i] = temp[i]; + delete[] temp; +} +``` + +### Quick Sort + +*Quick Sort* is also a *Divide and Conquer* algorithm. The algorithm chooses an element from the array as a pivot and partitions the array around it. Partitioning rearranges the array so that the pivot is placed at its correct position, all smaller values come before the pivot, and all greater values come after it. Partitioning can be done in linear time, and afterwards we can apply the same sorting function to the parts on the left and right of the pivot recursively. + +If the selected pivot does not divide the array evenly after the partitioning, the time complexity can reach $\mathcal{O}(n^2)$, as in insertion sort. 
To avoid this, the pivot is generally picked randomly. + +#### Procedure + +1. If the size of the array is $1$, it is already sorted; stop the algorithm (base case), +2. Choose a pivot randomly, +3. For all values in the array, collect smaller values in the left of the array and greater values in the right of the array, +4. Move the pivot to the correct place, +5. Repeat the same algorithm for the left partition and the right partition. + +#### Complexity + +$$ +\begin{align*} +T(N) &= T\left(\tfrac{N}{10}\right) + T\left(9 \cdot \tfrac{N}{10}\right) + \mathcal{O}(N) \\ +T(N) &= \mathcal{O}(N \cdot \log N) +\end{align*} +$$ + +```c++ +void quickSort(int *ar, int size) { + if (size <= 1) // base case + return; + + int position = 1; // find the correct place of pivot + swap(ar[0], ar[rand() % size]); + + for (int i = 1; i < size; i++) + if (ar[0] > ar[i]) + swap(ar[i], ar[position++]); + swap(ar[0], ar[position - 1]); + + quickSort(ar, position - 1); + quickSort(ar + position, size - position); +} +``` + +### Radix Sort + +*Quick Sort* and *Merge Sort* are comparison-based sorting algorithms and cannot run faster than $\mathcal{O}(N \log N)$ in the worst case. However, *Radix Sort* is not comparison-based: it runs in $\mathcal{O}(K \cdot (N + B))$ time, where $B$ is the base and $K = \log_B(\max(ar))$ is the number of digits of the maximum value, which is close to linear for a fixed base. + +#### Procedure + +1. For each digit from the least significant to the most, sort the array using *Counting Sort* according to the corresponding digit. *Counting Sort* is used for keys in a limited range; it counts the number of elements with each key value. After counting the distinct key values, we can determine the position of every element in the array. + +#### Complexity + +$$ +\begin{align*} +T(N) &= \mathcal{O}(K \cdot (N + B)) +\end{align*} +$$ + +```c++ +void radixSort(int *ar, int size, int base = 10) { + int *temp = new int[size]; + int *count = new int[base](); + + // Find the maximum value. 
+ int maxx = ar[0]; + for (int i = 1; i < size; i++) { + if (ar[i] > maxx) { + maxx = ar[i]; + } + } + + for (int e = 1; maxx / e > 0; e *= base) { + memset(count, 0, sizeof(int) * base); + + for (int i = 0; i < size; i++) + count[(ar[i] / e) % base]++; + + for (int i = 1; i < base; i++) + count[i] += count[i - 1]; + + for (int i = size - 1; 0 <= i; i--) + temp[--count[(ar[i] / e) % base]] = ar[i]; + + for (int i = 0; i < size; i++) + ar[i] = temp[i]; + } + + delete[] temp; + delete[] count; +} +``` + + +## Quickselect Algorithm + +*Quickselect* is a selection algorithm that *finds the $k^{th}$ smallest element in an unordered list*. The algorithm is closely related to Quick Sort in its partitioning stage; however, instead of recurring for both sides, it recurs only for the part that contains the $k^{th}$ smallest element. + +#### Procedure + +1. Choose a pivot randomly, +2. For all values in the array, collect smaller values in the left of the array and greater values in the right of the array, +3. Move the pivot to the correct place, +4. If the current position is equal to $k$, return the value at the position. +5. If the current position is more than $k$, repeat the same algorithm for the left partition. +6. Else, update $k$ and repeat the same algorithm for the right partition. + +#### Complexity + +- On average: $\mathcal{O}(N)$ +- Worst-case: $\mathcal{O}(N^2)$ + +> Note that this algorithm is fast in practice, but has poor worst-case performance, like quicksort. However, it still performs better on average than algorithms, such as median of medians, that find the $k^{th}$ smallest element in $\mathcal{O}(n)$ in the worst case. + +```c++ +// This function finds the k-th smallest element in arr within size si. +int QuickSelect(int *arr, int si, int k) { + // Check if k is valid and if arr has at least k elements. + if (0 < k && k <= si) { + // The quicksort-like partitioning. It is the same until we find the index of the pivot. 
+ int ind = 0; + + // Get a random pivot to decrease the chance of hitting the worst-case scenario. + swap(arr[si - 1], arr[rand() % si]); + for (int j = 0; j < si - 1; j++) { + if (arr[j] <= arr[si - 1]) { + swap(arr[j], arr[ind]); + ind++; + } + } + swap(arr[si - 1], arr[ind]); + + // Now recur to the appropriate side. + // If the index is equal to k-1 (as our array is 0-indexed), return the value. + if (ind == k - 1) { + return arr[ind]; + } + // Else check if the index is greater than k-1. If it is, recur to the left part. + else if (ind > k - 1) { + return QuickSelect(arr, ind, k); + } + // Else, recur to the right part. + else { + return QuickSelect(arr + ind + 1, si - ind - 1, k - ind - 1); + } + } + // If invalid values are given. + return INT_MAX; +} +``` + +## Divide and Conquer + +*Divide and Conquer* is a well-known paradigm that **breaks** up the problem into several parts, **solves** each part independently, and finally **combines** the solutions to the subproblems into the overall solution. Because each subproblem is solved recursively, the subproblems should be smaller versions of the original problem, and the problem must have a base case to end the recursion. + +Some example algorithms that use the divide and conquer technique: + +- Merge Sort +- Count Inversions +- Finding the Closest Pair of Points +- [Others](https://www.geeksforgeeks.org/divide-and-conquer){target="_blank"} + +
+<figure markdown>
+![The Flow of *Divide and Conquer*](img/divide_and_conquer.png)
+<figcaption>The Flow of Divide and Conquer</figcaption>
+</figure>
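Among the examples above, *Count Inversions* shows how little the paradigm costs on top of *Merge Sort*: only the merge step changes. The sketch below is ours, not part of the bundle code (the name `countInversions` is illustrative); it counts the pairs $i < j$ with $ar_i > ar_j$ while sorting the range $[l, r)$:

```c++
#include <vector>
using namespace std;

// Counts pairs (i, j) with i < j and ar[i] > ar[j] on [l, r),
// sorting that range as a side effect (merge sort's merge step).
long long countInversions(vector<int> &ar, int l, int r) {
    if (r - l <= 1) // base case: a single element has no inversions
        return 0;
    int mid = (l + r) / 2;
    long long inv = countInversions(ar, l, mid) + countInversions(ar, mid, r);

    vector<int> temp;
    int left = l, right = mid;
    while (left < mid or right < r) {
        if (right == r or (left < mid and ar[left] <= ar[right]))
            temp.push_back(ar[left++]);
        else {
            inv += mid - left; // every element left in the first half is greater
            temp.push_back(ar[right++]);
        }
    }
    for (int i = l; i < r; i++)
        ar[i] = temp[i - l];
    return inv;
}
```

The only extra work over plain merge sort is the single `inv += mid - left;` line, so the running time stays $\mathcal{O}(N \log N)$.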
\ No newline at end of file diff --git a/docs/data-structures/deque.md b/docs/data-structures/deque.md new file mode 100644 index 0000000..dc29c40 --- /dev/null +++ b/docs/data-structures/deque.md @@ -0,0 +1,31 @@ +--- +title: Deque +tags: + - Data Structures + - Deque +--- + +Deque veri yapısı stack ve queue veri yapılarına göre daha kapsamlıdır. Bu veri yapısında yapının en üstüne eleman eklenebilirken aynı zamanda en altına da eklenebilir. Aynı şekilde yapının hem en üstündeki elemanına hem de en alttaki elemanına erişim ve silme işlemleri uygulanabilir. Bu veri yapısında uyguluyabildiğimiz işlemler: + +- Veri yapısının en üstüne eleman ekleme. +- Veri yapısının en altına eleman ekleme. +- Veri yapısının en üstündeki elemanına erişim. +- Veri yapısının en altındaki elemanına erişim. +- Veri yapısının en üstündeki elemanı silme. +- Veri yapısının en altındaki elemanı silme. + +C++ dilindeki STL kütüphanesinde bulunan hazır deque yapısının kullanımı aşağıdaki gibidir: + +```c++ +int main() { + deque q; + q.push_front(5); // deque'nin en altina 5'i ekler. + q.push_back(6); // deque'nin en ustune 6'yi ekler. + int x = q.front(); // deque'nin en altindaki elemanina erisim. + int y = q.back(); // deque'nin en ustundeki elemanina erisim. + q.pop_front(); // deque'nin en altindaki elemanini silme. + q.pop_back(); // deque'nin en ustundeki elemanini silme. +} +``` + +**P.S.** deque veri yapısı stack ve queue veri yapılarına göre daha kapsamlı olduğundan ötürü stack ve queue veri yapılarına göre 2 kat fazla memory kullandığını açıklıkla söyleyebiliriz. 
diff --git a/docs/data-structures/fenwick-tree.md b/docs/data-structures/fenwick-tree.md new file mode 100644 index 0000000..92f9dfc --- /dev/null +++ b/docs/data-structures/fenwick-tree.md @@ -0,0 +1,120 @@ +---
+title: Fenwick Tree
+tags:
+  - Data Structures
+  - Fenwick Tree
+  - Binary Indexed Tree
+  - BIT
+---
+
+Fenwick Tree, also known as Binary Indexed Tree, is a data structure similar to [Prefix Sum](prefix-sum.md) and [Sparse Table](sparse-table.md) that additionally lets us modify the array. Its biggest advantages over those structures are that it is faster in practice and that its memory complexity is $\mathcal{O}(N)$. However, since a Fenwick Tree can only store prefix answers (or suffix answers), we cannot obtain the answers of some queries over arbitrary ranges, such as minimum, maximum and GCD.
+
+## Structure and Construction
+
+Let $g(x)$ be the integer whose binary representation keeps only the rightmost set bit of $x$. For example, the binary representation of $20$ is $(10100)_2$, so $g(20)=4$, because the rightmost set bit is the $3^{rd}$ bit from the right and $(00100)_2=4$. The Fenwick Tree is built so that its node with index $x$ stores the answer of the interval from the element with index $x - g(x) + 1$ to the element with index $x$.
+
+![Fenwick Tree structure built for an array of length 8](img/fenwick.png){ width="80%" }
+
Fenwick Tree structure built for an array of length $8$
+
+
+
+## Query Algorithm
+
+For any interval $[1,x]$, the query algorithm works as follows:
+
+1. Add the answer of the interval $[x - g(x) + 1,x]$ to the answer we are looking for.
+2. Set the value of $x$ to $x - g(x)$. If the new value of $x$ is greater than $0$, continue from step $1$.
+
+The number of operations performed to compute the answer of the interval $[1,x]$ is equal to the number of $1$s in the binary representation of $x$, because in every iteration we remove the rightmost set bit of $x$. Therefore the query operation works in $\mathcal{O}(\log N)$ time complexity. We can also easily obtain the answer of an interval $[l,r]$ by subtracting the answer of the interval $[1,l - 1]$ from the answer of the interval $[1,r]$.
+
+> NOTE: We can easily compute the value of $g(x)$ with bitwise operators using the following identity:
+> \\[g(x) = x \ \& \ (-x)\\]
+
+## Point Update Algorithm
+
+The algorithm used to update the value of the element with index $x$ works as follows:
+
+- Update the values of all nodes in the tree that contain the element with index $x$.
+
+Since there are at most $\log(N)$ intervals in the Fenwick Tree containing the element with index $x$, the update algorithm works in $\mathcal{O}(\log N)$ time complexity.
+
+## Sample Code
+
+```c++
+const int n = 100000; // size of the array
+int tree[n + 1], a[n + 1];
+
+void add(int val, int x) { // increases the value of the element with index x by val
+  // increases every node affected by index x by val
+  while (x <= n) {
+    tree[x] += val;
+    x += x & (-x);
+  }
+}
+
+int sum(int x) { // returns the sum of the elements
+  int res = 0;   // from index 1 to index x
+  while (x >= 1) {
+    res += tree[x];
+    x -= x & (-x);
+  }
+  return res;
+}
+
+int query(int l, int r) { // returns the sum of the elements in [l,r]
+  return sum(r) - sum(l - 1);
+}
+
+void build() { // builds the fenwick tree over the array a
+  for (int i = 1; i <= n; i++)
+    add(a[i], i);
+}
+```
+
+You can find a sample problem about the Fenwick Tree data structure [here](https://www.spoj.com/problems/CSUMQ){target="_blank"}.
+
+## Range Update and Point Query
+
+Assume we will perform the operations on an array $a$. Let us define an array $b$ such that $a$ is the prefix sum array of $b$; in other words, $a_i = \displaystyle\sum_{j=1}^{i} {b_j}$ must hold. We build the Fenwick Tree structure over this new array $b$. To add the value $x$ to every element in the interval $[l,r]$, we need to perform the following operations:
+
+- Increase the value of $b_l$ by $x$. This increases the value of every element from index $l$ to the end of the array by $x$.
+- Decrease the value of $b_{r + 1}$ by $x$. This decreases the value of every element from index $r + 1$ to the end of the array by $x$. As a result of these two operations, only the values of the elements in the interval $[l,r]$ are increased by $x$.
+
+### Sample Code
+
+```c++
+const int n = 100000; // size of the array
+int a[n + 1], b[n + 1], tree[n + 1];
+
+void add(int val, int x) { // increases the value of the element with index x by val
+  while (x <= n) {
+    tree[x] += val;
+    x += x & (-x);
+  }
+}
+
+int sum(int x) { // returns the sum of the elements
+  int res = 0;   // from index 1 to index x
+  while (x >= 1) {
+    res += tree[x];
+    x -= x & (-x);
+  }
+  return res;
+}
+void build() {
+  for (int i = 1; i <= n; i++)
+    b[i] = a[i] - a[i - 1]; // construct the b array
+
+  for (int i = 1; i <= n; i++)
+    add(b[i], i); // build the fenwick tree over the b array
+}
+
+void update(int l, int r, int x) {
+  add(x, l);
+  add(-x, r + 1);
+}
+
+int query(int x) { return sum(x); }
+``` diff --git a/docs/data-structures/img/fenwick.png b/docs/data-structures/img/fenwick.png new file mode 100644 index 0000000..06d5051 Binary files /dev/null and b/docs/data-structures/img/fenwick.png differ diff --git a/docs/data-structures/img/linkedlist.png b/docs/data-structures/img/linkedlist.png new file mode 100644 index 0000000..8ee11a0 Binary files /dev/null and b/docs/data-structures/img/linkedlist.png differ diff --git a/docs/data-structures/img/mo.png b/docs/data-structures/img/mo.png new file mode 100644 index 0000000..8166582 Binary files /dev/null and b/docs/data-structures/img/mo.png differ diff --git a/docs/data-structures/img/naive_update.png b/docs/data-structures/img/naive_update.png new file mode 100644 index 0000000..60da873 Binary files /dev/null and b/docs/data-structures/img/naive_update.png differ diff --git a/docs/data-structures/img/query_soln.png b/docs/data-structures/img/query_soln.png new file mode 100644 index 0000000..b3733ae Binary files /dev/null and b/docs/data-structures/img/query_soln.png differ diff --git a/docs/data-structures/img/segtree.png b/docs/data-structures/img/segtree.png new file mode 100644 index 0000000..8fdf75b Binary files /dev/null and b/docs/data-structures/img/segtree.png differ diff --git a/docs/data-structures/img/segtreequery.png b/docs/data-structures/img/segtreequery.png new file mode 100644 index 0000000..895f582 Binary files /dev/null and b/docs/data-structures/img/segtreequery.png differ diff --git a/docs/data-structures/img/segtreeupdate.png b/docs/data-structures/img/segtreeupdate.png new file mode 100644 index 0000000..47edc23 Binary files /dev/null and b/docs/data-structures/img/segtreeupdate.png differ diff --git a/docs/data-structures/img/trie.png b/docs/data-structures/img/trie.png new file mode 100644 index 0000000..3c2eb47 Binary files /dev/null and
b/docs/data-structures/img/trie.png differ diff --git a/docs/data-structures/img/updated_segtree.png b/docs/data-structures/img/updated_segtree.png new file mode 100644 index 0000000..a39aabe Binary files /dev/null and b/docs/data-structures/img/updated_segtree.png differ diff --git a/docs/data-structures/index.md b/docs/data-structures/index.md new file mode 100644 index 0000000..fc0e927 --- /dev/null +++ b/docs/data-structures/index.md @@ -0,0 +1,66 @@ +---
+title: Data Structures
+tags:
+  - Data Structures
+---
+
+**Editor:** Tahsin Enes Kuru
+
+**Reviewers:** Baha Eren Yaldız, Burak Buğrul
+
+**Contributors:** Kerim Kochekov
+
+## Introduction
+
+In computer science, data structures are structures that allow us to efficiently obtain information about a given set of elements and, at the same time, to make modifications on these elements. They generally work by storing the values of the elements according to a specific rule and then using the stored structure to answer questions about the elements (for example, finding the smallest number in a given range of an array).
+
+## Dynamic Data Structures
+
+### [Linked List](linked-list.md)
+### [Stack](stack.md)
+### [Queue](queue.md)
+### [Deque](deque.md)
+### [Fenwick Tree](fenwick-tree.md)
+### [Segment Tree](segment-tree.md)
+### [Trie](trie.md)
+
+## Static Data Structures
+
+### [Prefix Sum](prefix-sum.md)
+### [Sparse Table](sparse-table.md)
+### [SQRT Decomposition](sqrt-decomposition.md)
+### [Mo's Algorithm](mo-algorithm.md)
+
+## Common Problems
+
+### [LCA](lowest-common-ancestor.md)
+
+## Sample Problems
+
+Recommended problems for practicing data structures:
+
+1. [Link](https://codeforces.com/problemset/problem/797/C){target="_blank"}
+2. [Link](https://codeforces.com/contest/276/problem/C){target="_blank"}
+3. [Link](https://codeforces.com/contest/380/problem/C){target="_blank"}
+4. [Link](https://www.hackerearth.com/problem/algorithm/benny-and-sum-2){target="_blank"}
+5.
[Link](https://www.hackerearth.com/practice/data-structures/advanced-data-structures/fenwick-binary-indexed-trees/practice-problems/algorithm/counting-in-byteland){target="_blank"}
+
+## Useful Links
+
+1. {target="_blank"}
+2. {target="_blank"}
+3. {target="_blank"}
+4. {target="_blank"}
+5. {target="_blank"}
+6. {target="_blank"}
+7. {target="_blank"}
+8. {target="_blank"}
+9. {target="_blank"}
+10. {target="_blank"}
+11. {target="_blank"}
+12. {target="_blank"}
+13. {target="_blank"}
+14. {target="_blank"}
+15. {target="_blank"}
+16. {target="_blank"}
+17. {target="_blank"} diff --git a/docs/data-structures/linked-list.md b/docs/data-structures/linked-list.md new file mode 100644 index 0000000..2f9a203 --- /dev/null +++ b/docs/data-structures/linked-list.md @@ -0,0 +1,53 @@ +---
+title: Linked List
+tags:
+  - Data Structures
+  - Linked List
+---
+
+In the Linked List data structure, elements are stored so that each element holds its own value and the address of the next element. The elements of the structure can be traversed starting from the head element and going to the tail element. Its advantage over an array is that memory is used dynamically. The operations that can be performed on this data structure:
+
+- Adding an element to the end of the structure.
+- Traversing the current structure from the head to the tail.
+
+![An example Linked List structure](img/linkedlist.png){ width="100%" }
+
An example Linked List structure
+
+
+```c++
+// We create a struct to hold each element (here numbers, i.e. int).
+struct node {
+  int data;
+  node *next;
+};
+node *head, *tail;
+
+void push_back(int x) {
+  // We create the new element in memory.
+  node *t = (node *)malloc(sizeof(node));
+  t->data = x; // We assign the element's data.
+  t->next = NULL; // Since we add to the end, we set its next pointer to NULL.
+
+  // If no element has been added to the structure yet,
+  // we initialize the head and tail elements.
+  if (head == NULL && tail == NULL) {
+    head = t;
+    tail = t;
+  }
+  // Otherwise we update the new tail element.
+  else {
+    tail->next = t;
+    tail = t;
+  }
+}
+
+void print() {
+  // We traverse all elements in the list.
+  node *t = head;
+  while (t != NULL) {
+    printf("%d ", t->data);
+    t = t->next;
+  }
+}
+``` diff --git a/docs/data-structures/lowest-common-ancestor.md b/docs/data-structures/lowest-common-ancestor.md new file mode 100644 index 0000000..225eeb8 --- /dev/null +++ b/docs/data-structures/lowest-common-ancestor.md @@ -0,0 +1,32 @@ +---
+title: Lowest Common Ancestors
+tags:
+  - Tree
+  - LCA
+  - Lowest Common Ancestors
+  - Binary Lifting
+---
+
+This problem consists of queries, LCA(x, y), each asking for the common ancestor of both x and y whose depth is maximum. We will use an algorithm similar to the jump pointer algorithm in our implementation.
+
+## Initialization
+
+As we did in the Jump Pointer Method, we will calculate every node's $2^i$-th ancestors if they exist. L[x][y] corresponds to x's $2^y$-th ancestor. Hence L[x][0] is simply the parent of x.
+
+```cpp
+void init() {
+  for(int x=1 ; x<=n ; x++)
+    L[x][0] = parent[x];
+
+  for(int y=1 ; y<=logN ; y++)
+    for(int x=1 ; x<=n ; x++)
+      L[x][y] = L[L[x][y-1]][y-1];
+}
+```
+Note that we have used the fact that x's $2^y$-th ancestor is x's $2^{y-1}$-th ancestor's $2^{y-1}$-th ancestor.
+
+## Queries-Binary Lifting
+
+Given a query LCA(x, y), we calculate the answer as follows:
+
+Firstly, ensure that both x and y are at the same depth.
If they are not at the same depth, lift the deeper one up to the other one's depth. Then check whether x and y are equal. If they are equal, the lowest common ancestor is x. After that, starting from i = log(N), check whether x's $2^i$-th ancestor is equal to y's $2^i$-th ancestor. If they are not equal, the LCA is somewhere above the $2^i$-th ancestors of x and y, so we continue to search for the LCA of the ancestors, since LCA(L[x][i], L[y][i]) is the same as LCA(x, y). Please notice that we have ensured that the depth difference between the LCA and both x and y is never larger than $2^i$. If we apply this procedure down to i = 0, we are left with x and y such that the parent of x is the LCA. Of course, the parent of y would also be the LCA. diff --git a/docs/data-structures/mo-algorithm.md b/docs/data-structures/mo-algorithm.md new file mode 100644 index 0000000..3eb4ce4 --- /dev/null +++ b/docs/data-structures/mo-algorithm.md @@ -0,0 +1,114 @@ +---
+title: Mo's Algorithm
+tags:
+  - Data Structures
+  - Mo's Algorithm
+---
+
+This method is a key technique for solving offline range queries on an array. By offline, we mean that we can find the answers to these queries in any order we want and that there are no updates. Let's introduce a problem and construct an efficient solution for it.
+
+You have an array $a$ with $N$ elements whose values range from $1$ to $M$. You have to answer $Q$ queries, each of the same type. You will be given a range $[l, r]$ for each query, and you have to print how many distinct values there are in the subarray $[a_l , a_{l+1}, \dots, a_{r-1}, a_r]$.
+
+First let's find a naive solution and improve it. Remember the frequency array we mentioned before. We will keep a frequency array that contains only the given subarray's values. The number of values in this frequency array that are bigger than $0$ will be our answer for the given query. Then we have to update the frequency array for the next query. We will use $\mathcal{O}(N)$ time for each query, so the total complexity will be $\mathcal{O}(Q \times N)$.
Look at the code below for the implementation.
+
+```cpp
+class Query {
+ public:
+  int l, r, ind;
+  Query(int l, int r, int ind) {
+    this->l = l, this->r = r, this->ind = ind;
+  }
+};
+
+void del(int ind, vector<int> &a, vector<int> &F, int &num) {
+  if (F[a[ind]] == 1) num--;
+  F[a[ind]]--;
+}
+
+void add(int ind, vector<int> &a, vector<int> &F, int &num) {
+  if (F[a[ind]] == 0) num++;
+  F[a[ind]]++;
+}
+
+vector<int> solve(vector<int> &a, vector<Query> &q) {
+  int Q = q.size(), N = a.size();
+  int M = *max_element(a.begin(), a.end());
+  vector<int> F(M + 1, 0);  // This is the frequency array we mentioned before
+  vector<int> ans(Q, 0);
+  int l = 0, r = -1, num = 0;
+  for (int i = 0; i < Q; i++) {
+    int nl = q[i].l, nr = q[i].r;
+    while (l < nl) del(l++, a, F, num);
+    while (l > nl) add(--l, a, F, num);
+    while (r > nr) del(r--, a, F, num);
+    while (r < nr) add(++r, a, F, num);
+    ans[q[i].ind] = num;
+  }
+  return ans;
+}
+```
+
+Time complexity for each query here is $\mathcal{O}(N)$. So the total complexity is $\mathcal{O}(Q \times N)$. Just by changing the order of the queries, we will reduce this complexity to $\mathcal{O}((Q + N) \times \sqrt N)$.
+
+## Mo's Algorithm
+
+We will change the order of answering the queries so that the overall complexity is reduced drastically. We will use the following cmp function to sort our queries and will answer them in this sorted order. The block size here is $\mathcal{O}(\sqrt N)$.
+
+```cpp
+bool operator<(Query other) const {
+  return make_pair(l / block_size, r) <
+         make_pair(other.l / block_size, other.r);
+}
+```
+
+Why does that work? Let's first examine what we do here, then find the complexity. We divide the $l$'s of the queries into blocks. The block number of a given $l$ is $l / \text{block\_size}$ (integer division). We sort the queries first by their block numbers and then, for equal block numbers, by their $r$'s. Sorting all queries takes $\mathcal{O}(Q \times \log Q)$ time. Let's look at how many times we will call the add and del operations to change the current $r$.
For the same block, $r$ always increases, so within one block it is $\mathcal{O}(N)$ operations. Since there are $N / \text{block\_size}$ blocks in total, it will be $\mathcal{O}(N \times N / \text{block\_size})$ operations in total. Within the same block, the add and del operations that change $l$ will be called at most $\mathcal{O}(\text{block\_size})$ times for each query, since if the block number is the same then the $l$'s can differ by at most $\mathcal{O}(\text{block\_size})$. So overall it is $\mathcal{O}(Q \times \text{block\_size})$. Also, when consecutive queries have different block numbers, we will perform at most $\mathcal{O}(N)$ operations, but notice that there are at most $\mathcal{O}(N / \text{block\_size})$ such consecutive pairs, so this doesn't change the overall time complexity. If we pick $\text{block\_size} = \sqrt N$, the overall complexity will be $\mathcal{O}((Q + N) \times \sqrt N)$. The full code is given below.
+
+
+![Example for the Algorithm](img/mo.png) +
Example for the Algorithm
+
+
+```cpp
+int block_size;
+
+class Query {
+ public:
+  int l, r, ind;
+  Query(int l, int r, int ind) {
+    this->l = l, this->r = r, this->ind = ind;
+  }
+  bool operator<(Query other) const {
+    return make_pair(l / block_size, r) <
+           make_pair(other.l / block_size, other.r);
+  }
+};
+
+void del(int ind, vector<int> &a, vector<int> &F, int &num) {
+  if (F[a[ind]] == 1) num--;
+  F[a[ind]]--;
+}
+
+void add(int ind, vector<int> &a, vector<int> &F, int &num) {
+  if (F[a[ind]] == 0) num++;
+  F[a[ind]]++;
+}
+
+vector<int> solve(vector<int> &a, vector<Query> &q) {
+  int Q = q.size(), N = a.size();
+  int M = *max_element(a.begin(), a.end());
+  block_size = sqrt(N);
+  sort(q.begin(), q.end());
+  vector<int> F(M + 1, 0);  // This is the frequency array we mentioned before
+  vector<int> ans(Q, 0);
+  int l = 0, r = -1, num = 0;
+  for (int i = 0; i < Q; i++) {
+    int nl = q[i].l, nr = q[i].r;
+    while (l < nl) del(l++, a, F, num);
+    while (l > nl) add(--l, a, F, num);
+    while (r > nr) del(r--, a, F, num);
+    while (r < nr) add(++r, a, F, num);
+    ans[q[i].ind] = num;
+  }
+  return ans;
+}
+``` diff --git a/docs/data-structures/prefix-sum.md b/docs/data-structures/prefix-sum.md new file mode 100644 index 0000000..1adc190 --- /dev/null +++ b/docs/data-structures/prefix-sum.md @@ -0,0 +1,53 @@ +---
+title: Prefix Sum
+tags:
+  - Data Structures
+  - Prefix Sum
+---
+
+The Prefix Sum array is a data structure built from the prefix sums of an array. The element with index $i$ of the prefix sum array is equal to the sum of the elements from index $1$ to index $i$ of the input array. In other words:
+
+$$sum_i = \sum_{j=1}^{i} {a_j}$$
+
+For an example array $A$, the prefix sum array is built as follows:
+
+
+| **Array A** | $4$ | $6$ | $3$ | $12$ | $1$ |
+|-------------------:|:---:|:-----:|:-------:|:----------:|:------------:|
+| **Prefix Sum Array** | $4$ | $10$ | $13$ | $25$ | $26$ |
+| | $4$ | $4+6$ | $4+6+3$ | $4+6+3+12$ | $4+6+3+12+1$ |
+
+
+Using the prefix sum array, we can easily obtain the sum of the elements in any interval $[l,r]$ as follows:
+
+$$sum_r = \sum_{j=1}^{r} {a_j}$$
+
+$$sum_{l - 1} = \sum_{j=1}^{l - 1} {a_j}$$
+
+$$sum_r - sum_{l-1} = \sum_{j=l}^{r} {a_j}$$
+
+## Sample Code
+
+While building the prefix sum array, the identity $sum_i = sum_{i - 1} + a_i$ is easy to see, and using it we can build the $sum[]$ array by traversing the elements of the input array in order:
+
+```c++
+const int n = 100000; // size of the array
+int sum[n + 1], a[n + 1];
+// a is the input array and sum is the prefix sum array.
+
+void build() {
+  for (int i = 1; i <= n; i++)
+    sum[i] = sum[i - 1] + a[i];
+  return;
+}
+
+int query(int l, int r) {
+  return sum[r] - sum[l - 1];
+}
+```
+
+## Time Complexity
+
+Building the prefix sum array takes $\mathcal{O}(N)$ time and memory, and we can answer each query in $\mathcal{O}(1)$.
+
+You can find a sample problem about the prefix sum data structure [here](https://codeforces.com/problemset/problem/816/B){target="_blank"}. diff --git a/docs/data-structures/queue.md b/docs/data-structures/queue.md new file mode 100644 index 0000000..c772608 --- /dev/null +++ b/docs/data-structures/queue.md @@ -0,0 +1,28 @@ +---
+title: Queue
+tags:
+  - Data Structures
+  - Queue
+---
+
+In the Queue data structure, elements are stored according to the first in, first out (FIFO) rule. The operations we can perform on this data structure:
+
+- Adding an element to the back of the structure.
+- Accessing the element at the front of the structure.
+- Removing the element at the front of the structure.
+- Checking whether the structure is empty.
+
+The usage of the ready-made queue structure in the C++ STL library is as follows:
+
+```c++
+int main() {
+  queue<int> q;
+  cout << q.empty() << endl; // The queue is empty at first, so this prints 1 (true).
+  q.push(5); // Adds 5 to the back of the queue.
The queue is now: {5}
+  q.push(7); // Adds 7 to the back of the queue. The queue is now: {5, 7}
+  q.push(6); // Adds 6 to the back of the queue. The queue is now: {5, 7, 6}
+  q.pop(); // Removes the element at the front of the queue. The queue is now: {7, 6}
+  q.push(1); // Adds 1 to the back of the queue. The queue is now: {7, 6, 1}
+  cout << q.front() << endl; // Accesses the element at the front of the queue. Prints 7.
+}
+``` \ No newline at end of file diff --git a/docs/data-structures/segment-tree.md b/docs/data-structures/segment-tree.md new file mode 100644 index 0000000..4e62f4f --- /dev/null +++ b/docs/data-structures/segment-tree.md @@ -0,0 +1,296 @@ +---
+title: Segment Tree
+tags:
+  - Data Structures
+  - Segment Tree
+---
+
+Segment Tree is a data structure that enables us to answer queries such as minimum, maximum and sum for any interval $[l,r]$ in $\mathcal{O}(\log N)$ time complexity, while also supporting updates.
+
+Segment Tree is more useful than the [Fenwick Tree](fenwick-tree.md) and [Sparse Table](sparse-table.md) structures because it allows updates on elements and provides the possibility to answer queries like minimum, maximum etc. for any $[l,r]$ interval. Also, the memory complexity of Segment Tree is $\mathcal{O}(N)$ while the memory complexity of the Sparse Table structure is $\mathcal{O}(N \log N)$.
+
+## Structure and Construction
+Segment Tree has a "Complete Binary Tree" structure. The leaf nodes of the Segment Tree store the elements of the array, and each internal node's value is calculated with a function that takes its children's values as inputs. Thus, the answers of certain intervals are stored in each node, and the answer of the whole array is stored in the root node. For example, for a Segment Tree structure built for the sum query, the value of each node is equal to the sum of its children's values.
+
+![segment tree structure to query sum on array a = [41,67,6,30,85,43,39]](img/segtree.png){ width="100%" } +
segment tree structure to query sum on array $a = [41,67,6,30,85,43,39]$
+
+ +```c++ +void build(int ind, int l, int r) { + // tree[ind] stores the answer of the interval [l,r] + if (l == r) { // leaf node reached + tree[ind] = a[l]; // store the value of the leaf node + } else { + int mid = (l + r) / 2; + build(ind * 2, l, mid); + build(ind * 2 + 1, mid + 1, r); + // the answer of the interval [l,mid] and [mid + 1,r] is the sum of their answers + tree[ind] = tree[ind * 2] + tree[ind * 2 + 1]; + } +} +``` + +## Query and Update Algorithms + +### Query Algorithm + +For any $[l,r]$ interval, the query algorithm works as follows: +- Divide the $[l,r]$ interval into the widest intervals that are stored in the tree. +- Merge the answers of these intervals to calculate the desired answer. + +There are at most $2$ intervals that are needed to calculate the answer at each depth of the tree. Therefore, the query algorithm works in $\mathcal{O}(\log N)$ time complexity. + +
+![on array a = [41,67,6,30,85,43,39] query at $[2,6]$ interval](img/segtreequery.png){ width="100%" } +
on array $a = [41,67,6,30,85,43,39]$ query at $[2,6]$ interval
+
+ +On array $a = [41,67,6,30,85,43,39]$, the answer of the $[2,6]$ interval is obtained by merging the answers of the $[2,3]$ and $[4,6]$ intervals. The answer for the sum query is calculated as $36+167=203$. + +```c++ +// [lw,rw] is the interval we are looking for the answer +// [l,r] is the interval that the current node stores the answer +int query(int ind, int l, int r, int lw, int rw) { + if (l > rw or r < lw) //current interval does not contain the interval we are looking for + return 0; + if (l >= lw and r <= rw) //current interval is completely inside the interval we are looking for + return tree[ind]; + + int mid = (l + r) / 2; + + // recursively calculate the answers of all intervals containing the x index + return query(ind * 2, l, mid, lw, rw) + query(ind * 2 + 1, mid + 1, r, lw, rw); +} +``` + +### Update Algorithm + +Update the value of every node that contains $x$ indexed element. + +It is sufficient to update the values of at most $\log(N)$ nodes from the leaf node containing the $x$ indexed element to the root node. Therefore, the time complexity of updating the value of any element is $\mathcal{O}(\log N)$. + +
+![the nodes that should be updated when updating the $5^{th}$ index of the array a = [41,67,6,30,85,43,39] are as follows:](img/segtreeupdate.png){ width="100%" } +
the nodes that should be updated when updating the $5^{th}$ index of the array $a = [41,67,6,30,85,43,39]$ are as follows:
+
+
+
+```c++
+void update(int ind, int l, int r, int x, int val) {
+  if (l > x || r < x) // x index is not in the current interval
+    return;
+  if (l == x and r == x) {
+    tree[ind] = val; // update the value of the leaf node
+    return;
+  }
+
+  int mid = (l + r) / 2;
+
+  // recursively update the values of all nodes containing the x index
+  update(ind * 2, l, mid, x, val);
+  update(ind * 2 + 1, mid + 1, r, x, val);
+  tree[ind] = tree[ind * 2] + tree[ind * 2 + 1];
+}
+```
+
+A sample problem related to the Segment Tree data structure can be found [here](https://codeforces.com/gym/100739/problem/A){target="_blank"}.
+
+## Segment Tree with Lazy Propagation
+Previously, the update function was called to update only a single value in the array. Please note that a single value update in the array may cause changes in multiple nodes of the Segment Tree, as there may be many segment tree nodes that contain this changed element in their range.
+
+### Lazy Propagation Algorithm
+We need a structure that can perform the following operations on an array $[1,N]$.
+
+- Add inc to all elements in the given range $[l, r]$.
+- Return the sum of all elements in the given range $[l, r]$.
+
+Notice that if the update were for a single element, we could use the segment tree we have learned before. A trivial structure that comes to mind is to use an array and do the operations by traversing and increasing the elements one by one. Both operations would take $\mathcal{O}(L)$ time complexity in this structure, where $L$ is the number of elements in the given range.
+
+Let's use the segment trees we have learned. The second operation is easy: we can do it in $\mathcal{O}(\log N)$. What about the first operation? Since we can only do single element updates in the regular segment tree, we have to update all elements in the given range one by one. Thus we have to perform the update operation $L$ times, which works in $\mathcal{O}(L \times \log N)$ for each range update. This looks bad, even worse than just using an array in a lot of cases.
+
+So we need a better structure. People developed a trick called lazy propagation to perform range updates on a structure that can perform single updates (this trick can be used in segment trees, treaps, k-d trees ...).
+
+The trick is to be lazy, i.e., do work only when needed: perform the updates only when you have to. Using lazy propagation we can do range updates in $\mathcal{O}(\log N)$ on a standard segment tree, which is definitely fast enough.
+
+### Updates Using Lazy Propagation
+Let's be lazy as told: when we need to update an interval, we will update a node, mark its children as needing an update, and update them only when needed. For this we need an array $lazy[]$ of the same size as the segment tree. Initially all elements of the $lazy[]$ array will be $0$, representing that there is no pending update. If there is a non-zero element $lazy[k]$, then node k of the segment tree needs to be updated with it before any query operation, and $lazy[2\cdot k]$ and $lazy[2 \cdot k + 1]$ must also be updated correspondingly.
+
+To update an interval we will keep 3 things in mind.
+
+- If the current segment tree node has any pending update, then first add that pending update to the current node and push the update to its children.
+- If the interval represented by the current node lies completely inside the interval to update, then update the current node and update the $lazy[]$ array for its children.
+- If the interval represented by the current node partially overlaps the interval to update, then recurse into the children as in the earlier update function.
+
+```c++
+void update(int node, int start, int end, int l, int r, int val) {
+  // If there's a pending update on the current node, apply it
+  if (lazy[node] != 0) {
+    tree[node] += (end - start + 1) * lazy[node]; // Apply the pending update
+    // If not a leaf node, propagate the lazy update to the children
+    if (start != end) {
+      lazy[2 * node] += lazy[node];
+      lazy[2 * node + 1] += lazy[node];
+    }
+    lazy[node] = 0; // Clear the pending update
+  }
+
+  // If the current interval [start, end] does not intersect with [l, r], return
+  if (start > r || end < l) {
+    return;
+  }
+
+  // If the current interval [start, end] is completely within [l, r], apply the update
+  if (l <= start && end <= r) {
+    tree[node] += (end - start + 1) * val; // Update the segment
+    // If not a leaf node, propagate the update to the children
+    if (start != end) {
+      lazy[2 * node] += val;
+      lazy[2 * node + 1] += val;
+    }
+    return;
+  }
+
+  // Otherwise, split the interval and update both halves
+  int mid = (start + end) / 2;
+  update(2 * node, start, mid, l, r, val);
+  update(2 * node + 1, mid + 1, end, l, r, val);
+
+  // After updating the children, recalculate the current node's value
+  tree[node] = tree[2 * node] + tree[2 * node + 1];
+}
+```
+
+This is the update function for the given problem. Notice that when we arrive at a node, all the postponed updates that would affect this node have already been performed, since we push them downwards as we descend to the node. Thus this node keeps exactly the values it would have if the range updates were done without laziness. So it seems to be working. How about queries?
+
+### Queries Using Lazy Propagation
+Since we have changed the update function to postpone the update operation, we will have to change the query function as well. The only change we need to make is to check if there is any pending update operation on the node. If there is a pending update, first update the node and then proceed the same way as the earlier query function.
As mentioned in the previous subsection, all the postponed updates that would affect a node will be performed before we reach it. Therefore, the sum value we look for will be correct.
+
+```c++
+int query(int node, int start, int end, int l, int r) {
+  // If the current interval [start, end] does not intersect with [l, r], return 0
+  if (start > r || end < l) {
+    return 0;
+  }
+
+  // If there's a pending update on the current node, apply it
+  if (lazy[node] != 0) {
+    tree[node] += (end - start + 1) * lazy[node]; // Apply the pending update
+    // If not a leaf node, propagate the lazy update to the children
+    if (start != end) {
+      lazy[2 * node] += lazy[node];
+      lazy[2 * node + 1] += lazy[node];
+    }
+    lazy[node] = 0; // Clear the pending update
+  }
+
+  // If the current interval [start, end] is completely within [l, r], return the value
+  if (l <= start && end <= r) {
+    return tree[node];
+  }
+
+  // Otherwise, split the interval and query both halves
+  int mid = (start + end) / 2;
+  int p1 = query(2 * node, start, mid, l, r); // Query the left child
+  int p2 = query(2 * node + 1, mid + 1, end, l, r); // Query the right child
+
+  // Combine the results from the left and right child nodes
+  return (p1 + p2);
+}
+```
+Notice that the only difference from the regular query function is pushing the lazy values downwards as we traverse. This is a widely used trick applicable to various problems, though not all range problems. You may notice that we leveraged properties of addition here: its associativity and commutativity allow merging multiple updates in the lazy array without considering their order. This assumption is crucial for lazy propagation. Other necessary properties are left as an exercise to the reader.
+
+## Binary Search on Segment Tree
+Assume we have an array $A$ that contains elements between $1$ and $M$. We have to perform 2 kinds of operations.
+
+- Change the value of the element at a given index $i$ to $x$.
+- Return the value of the $k$-th element of the array when sorted.
+
+### How to Solve It Naively
+Let’s construct a frequency array: $F[i]$ keeps how many times the number $i$ occurs in our original array. We want to find the smallest $i$ such that $\sum_{j=1}^{i} F[j] \geq k$. Then the number $i$ is our answer for the query. For updates, we just have to change the $F$ array accordingly.
+
+![naive updates](img/naive_update.png){ width="100%" } +
A naive update example
+
+
+This is the naive algorithm. Update is $\mathcal{O}(1)$ and query is $\mathcal{O}(M)$.
+
+```c++
+void update(int i, int x) {
+    F[A[i]]--;     // remove the old value of A[i] from the frequency array
+    F[A[i] = x]++; // set A[i] to x and count the new value
+}
+
+int query(int k) {
+    int sum = 0;
+    // Iterate through the frequency array F to find the smallest value
+    // for which the cumulative frequency is at least k
+    for (int i = 1; i <= M; i++) {
+        sum += F[i]; // Add the frequency of value i to the cumulative sum
+        if (sum >= k) {
+            return i;
+        }
+    }
+    return -1; // k is larger than the number of elements
+}
+```
+
+### How to Solve It With Segment Tree
+This is, of course, slow. Let’s use segment trees to improve it. First we construct a segment tree on the $F$ array. The segment tree performs single-element updates and range sum queries. We then use binary search to find the corresponding $i$ for $k$-th element queries.
+
+![segment tree updates](img/updated_segtree.png){ width="100%" } +
Segment Tree After First Update
+
+
+```cpp
+void update(int i, int x) {
+    update(1, 1, M, A[i], --F[A[i]]); // Decrement frequency of old value
+    A[i] = x; // Update A[i] to new value
+    update(1, 1, M, A[i], ++F[A[i]]); // Increment frequency of new value
+}
+
+int query(int k) {
+    int l = 1, r = M; // Initialize binary search range over values 1..M
+    while (l < r) {
+        int mid = (l + r) / 2;
+        if (query(1, 1, M, 1, mid) < k)
+            l = mid + 1; // Move lower bound up
+        else
+            r = mid; // Move upper bound down
+    }
+    return l; // Return smallest value where cumulative frequency is at least k
+}
+```
+
+If you look at the code above, you can see that each update takes $\mathcal{O}(\log M)$ time and each query takes $\mathcal{O}(\log^{2} M)$ time, but we can do better.
+
+### How To Speed Up?
+If you look at the segment tree solution in the preceding subsection, you can see that queries are performed in $\mathcal{O}(\log^{2} M)$ time. We can make this faster; in fact, we can reduce the time complexity to $\mathcal{O}(\log M)$, the same as the time complexity of updates. The idea is to do the binary search while traversing the segment tree. We start from the root and look at its left child’s sum value: if this value is at least $k$, our answer is somewhere in the left child’s subtree; otherwise it is somewhere in the right child’s subtree, where we search for the $(k - \text{left sum})$-th element. We follow this rule until we reach a leaf, which is our answer. Since we traverse only $\mathcal{O}(\log M)$ nodes (one node at each level), the time complexity is $\mathcal{O}(\log M)$. Look at the code below for better understanding.
+
+![solution of first query](img/query_soln.png){ width="100%" } +
Solution of First Query
+
+
+```cpp
+void update(int i, int x) {
+    update(1, 1, M, A[i], --F[A[i]]); // Decrement frequency of old value
+    A[i] = x; // Update A[i] to new value
+    update(1, 1, M, A[i], ++F[A[i]]); // Increment frequency of new value
+}
+
+int query(int node, int start, int end, int k) {
+    if (start == end) return start; // Leaf node, return the index
+    int mid = (start + end) / 2;
+    if (tree[2 * node] >= k)
+        return query(2 * node, start, mid, k); // Search in left child
+    return query(2 * node + 1, mid + 1, end, k - tree[2 * node]); // Search in right child
+}
+
+int query(int k) {
+    return query(1, 1, M, k); // Public interface for querying
+}
+```
diff --git a/docs/data-structures/sparse-table.md b/docs/data-structures/sparse-table.md
new file mode 100644
index 0000000..5cc9351
--- /dev/null
+++ b/docs/data-structures/sparse-table.md
@@ -0,0 +1,84 @@
+---
+title: Sparse Table
+tags:
+  - Data Structures
+  - Sparse Table
+---
+
+The sparse table is a data structure that allows us to answer queries such as the sum, minimum, maximum, and GCD of the elements in a range in $\mathcal{O}(\log N)$ time. Some types of queries (such as finding the minimum or maximum in a range) can even be answered in $\mathcal{O}(1)$ time.
+
+This data structure is built by preprocessing fixed, unchanging data; it is not useful for dynamic data. Whenever the data changes, the sparse table must be rebuilt, which is costly.
+
+## Structure and Construction
+
+The sparse table is a two-dimensional array with $\mathcal{O}(N\log N)$ memory complexity. For every element of the array, it stores the answers for the ranges that extend a power of $2$ beyond that element: the table is built so that $ST_{x,i}$ stores the answer for the range from the element at index $x$ to the element at index $x + 2^i - 1$. 
+
+```c++
+// Sparse table built for sum queries
+const int N = 100000; // maximum array size
+const int LOG = 17;   // smallest LOG with 2^LOG >= N
+int n, a[N + 1], ST[N + 1][LOG + 1];
+
+void build() {
+    for (int i = 1; i <= n; i++) {
+        // The answer for the range [i,i] is the element at index i.
+        ST[i][0] = a[i];
+    }
+
+    for (int j = 1; j <= LOG; j++)
+        for (int i = 1; i + (1 << j) - 1 <= n; i++) {
+            // The answer for the range [i,i+2^j-1] is obtained by merging
+            // the answers for the ranges [i,i+2^(j-1)-1] and [i+2^(j-1),i+2^j-1].
+            ST[i][j] = ST[i][j - 1] + ST[i + (1 << (j - 1))][j - 1];
+        }
+
+    return;
+}
+```
+
+## Query Algorithm
+
+For any range $[l,r]$, the query algorithm works as follows:
+
+- Split $[l,r]$ into ranges whose answers we have precomputed.
+    - Since we only store answers for ranges whose length is a power of $2$, we must split our range into pieces of power-of-two length. Writing the length of $[l,r]$ in binary tells us exactly which pieces to use.
+- Merge the answers coming from these ranges to compute the answer for $[l,r]$.
+
+Since the binary representation of the length of any range contains at most $\log(N)$ ones, we split into at most $\log(N)$ ranges. Therefore the query runs in $\mathcal{O}(\log N)$ time.
+
+For example, to compute the answer for the range $[4,17]$, the algorithm splits it into the ranges $[4,11]$, $[12,15]$ and $[16,17]$, and merges the answers coming from these $3$ ranges.
+
+```c++
+// sum query
+int query(int l, int r) {
+    int res = 0;
+
+    for (int i = LOG; i >= 0; i--) {
+        // At each step we add the answer of the largest range that does not
+        // exceed the remaining length, and move l to just past that range. 
+        if (l + (1 << i) - 1 <= r) {
+            res += ST[l][i];
+            l += (1 << i);
+        }
+    }
+
+    return res;
+}
+```
+
+## Minimum and Maximum Queries
+
+The biggest advantage of the sparse table over other data structures is that it can answer range minimum or maximum queries in $\mathcal{O}(1)$ time.
+
+For these queries, counting an element of the range more than once does not change the answer. This allows us to cover the range with at most $2$ (possibly overlapping) power-of-two-length ranges and merge their answers in $\mathcal{O}(1)$ time.
+
+```c++
+int RMQ(int l, int r) {
+    // the lg[] array stores the precomputed log2 value of every number.
+    int j = lg[r - l + 1];
+    return min(ST[l][j], ST[r - (1 << j) + 1][j]);
+}
+```
+
+You can find an example problem about the sparse table [here](https://www.spoj.com/problems/RMQSQ){target="_blank"}.
diff --git a/docs/data-structures/sqrt-decomposition.md b/docs/data-structures/sqrt-decomposition.md
new file mode 100644
index 0000000..07e0de5
--- /dev/null
+++ b/docs/data-structures/sqrt-decomposition.md
@@ -0,0 +1,191 @@
+---
+title: SQRT Decomposition
+tags:
+  - Data Structures
+  - SQRT Decomposition
+  - Square Root Decomposition
+---
+
+Square root decomposition gives us a data structure that allows us to answer range queries on an array in $\mathcal{O}(\sqrt{N})$ time and to perform point updates in $\mathcal{O}(1)$ time.
+
+## Structure and Construction
+
+The elements of the array are split into blocks, each of length roughly $\sqrt{N}$. The answer for each block is computed separately and stored in an array.
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Blokların Cevapları$21$$13$$50$$32$
Dizideki Elemanlar$3$$6$$2$$10$$3$$1$$4$$5$$2$$7$$37$$4$$11$$6$$8$$7$
Elemanların İndeksleri$1$$2$$3$$4$$5$$6$$7$$8$$9$$10$$11$$12$$13$$14$$15$$16$
+ + *Örnek bir dizi üzerinde toplam sorgusu için kurulmuş SQRT Decompostion veri yapısı.* +
+
+```c++
+void build() {
+    for (int i = 1; i <= n; i++) {
+        if (i % sq == 1) { // sq = sqrt(n)
+            t++;           // start of a new block
+            st[t] = i;     // block t starts at index i
+        }
+        fn[t] = i;      // update the end of block t to index i
+        wh[i] = t;      // the element at index i belongs to block t
+        sum[t] += a[i]; // add the element at index i to the answer of block t
+    }
+}
+```
+
+## Query Algorithm
+
+For any range $[l,r]$, the query algorithm works as follows:
+
+1. Add to our answer the answers of the blocks completely covered by the queried range.
+2. For the blocks that are only partially covered, walk over the elements of those blocks that lie inside our range one by one and add them to the answer.
+
+Since the queried range can cover at most $\sqrt{N}$ blocks, step $1$ runs at most $\sqrt{N}$ times. There can be at most $2$ partially covered blocks (one at the left end, one at the right end). For these $2$ blocks we visit at most $2\sqrt{N}$ elements, so a query performs at most $3\sqrt{N}$ operations; therefore queries run in $\mathcal{O}(\sqrt{N})$ time.
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Blokların Cevapları$21$$13$$50$$32$
Dizideki Elemanlar$3$$6$$2$$10$$3$$1$$4$$5$$2$$7$$37$$4$$11$$6$$8$$7$
Elemanların İndeksleri$1$$2$$3$$4$$5$$6$$7$$8$$9$$10$$11$$12$$13$$14$$15$$16$
+ + *Örnek dizideki $[3,13]$ aralığının cevabını $2.$ ve $3.$ blokların cevapları ile $3,4$ ve $11$ indeksli elemanların toplanmasıyla elde edilir.* +
+
+```c++
+// Function computing the sum of the elements in the range [l,r].
+int query(int l, int r) {
+    int res = 0;
+
+    if (wh[l] == wh[r]) { // if l and r are inside the same block
+        for (int i = l; i <= r; i++)
+            res += a[i];
+    } else {
+        for (int i = wh[l] + 1; i <= wh[r] - 1; i++)
+            res += sum[i]; // add the answers of the completely covered blocks
+
+        // add the elements of our range that lie in the
+        // partially covered blocks
+
+        for (int i = st[wh[l]]; i <= fn[wh[l]]; i++)
+            if (i >= l && i <= r)
+                res += a[i];
+
+        for (int i = st[wh[r]]; i <= fn[wh[r]]; i++)
+            if (i >= l && i <= r)
+                res += a[i];
+    }
+
+    return res;
+}
+```
+
+## Point Update Algorithm
+
+When updating the value of an element, it is enough to update the answer of the block containing that element. Therefore updates run in $\mathcal{O}(1)$ time.
+
+```c++
+void update(int x, int val) {
+    // set the element at index x to the value val
+    sum[wh[x]] -= a[x];
+    a[x] = val;
+    sum[wh[x]] += a[x];
+}
+```
+
+You can find an example problem about square root decomposition [here](https://codeforces.com/contest/13/problem/E){target="_blank"}.
diff --git a/docs/data-structures/stack.md b/docs/data-structures/stack.md
new file mode 100644
index 0000000..1e2f686
--- /dev/null
+++ b/docs/data-structures/stack.md
@@ -0,0 +1,29 @@
+---
+title: Stack
+tags:
+  - Data Structures
+  - Stack
+---
+
+The stack data structure stores its elements according to the last in, first out (LIFO) rule. The operations we can perform on this data structure are:
+
+- Pushing an element onto the top of the structure.
+- Accessing the element at the top of the structure.
+- Removing the element at the top of the structure.
+- Checking whether the structure is empty. 
+
+The ready-made stack structure in C++'s STL library is used as follows:
+
+```c++
+#include <stack>
+#include <iostream>
+using namespace std;
+
+int main() {
+    stack<int> st;
+    cout << st.empty() << endl; // Prints 1 (true) because the stack is empty at first.
+    st.push(5); // Pushes 5 onto the top. The stack is now: {5}
+    st.push(7); // Pushes 7 onto the top. The stack is now: {7, 5}
+    st.push(6); // Pushes 6 onto the top. The stack is now: {6, 7, 5}
+    st.pop(); // Removes the top element. The stack is now: {7, 5}
+    st.push(1); // Pushes 1 onto the top. The stack is now: {1, 7, 5}
+    cout << st.top() << endl; // Accesses the top element. Prints 1.
+    cout << st.empty() << endl; // Prints 0 (false) because the stack is not empty now.
+}
+```
\ No newline at end of file
diff --git a/docs/data-structures/trie.md b/docs/data-structures/trie.md
new file mode 100644
index 0000000..87ab436
--- /dev/null
+++ b/docs/data-structures/trie.md
@@ -0,0 +1,66 @@
+---
+title: Trie
+tags:
+  - Data Structures
+  - Trie
+---
+
+Trie is an efficient information reTrieval data structure. Using a trie, search complexity can be brought to the optimal limit (key length). If we store keys in a binary search tree, a well-balanced BST needs time proportional to $M \times \log N$, where $M$ is the maximum string length and $N$ is the number of keys in the tree. Using a trie, we can search for a key in $\mathcal{O}(M)$ time. The penalty, however, is the trie's storage requirements (see [Applications of Trie](https://www.geeksforgeeks.org/advantages-trie-data-structure/) for more details). 
+![Trie Structure https://www.geeksforgeeks.org/wp-content/uploads/Trie.png](img/trie.png) +
Trie Structure. https://www.geeksforgeeks.org/wp-content/uploads/Trie.png
+
+
+Every node of a trie consists of multiple branches, each representing a possible character of the keys. We need to mark the last node of every key as an end-of-word node; the `isEndOfWord` field of a trie node is used to distinguish such nodes. A simple structure to represent nodes of the English alphabet can be as follows:
+
+```cpp
+const int ALPHABET_SIZE = 26;
+
+// Trie node
+class TrieNode {
+   public:
+    TrieNode *children[ALPHABET_SIZE];
+    bool isEndOfWord;
+    TrieNode() {
+        isEndOfWord = false;
+        for (int i = 0; i < ALPHABET_SIZE; i++)
+            children[i] = NULL;
+    }
+};
+```
+
+## Insertion
+
+Inserting a key into a trie is simple: every character of the input key is inserted as an individual trie node. Note that `children` is an array of pointers (or references) to next-level trie nodes, and the key character acts as an index into this array. If the input key is new or an extension of an existing key, we construct the non-existing nodes of the key and mark the last node as end of word. If the input key is a prefix of an existing key, we simply mark the last node of the key as end of word. The key length determines the trie depth.
+
+```cpp
+void insert(struct TrieNode *root, string key) {
+    struct TrieNode *pCrawl = root;
+    for (int i = 0; i < key.length(); i++) {
+        int index = key[i] - 'a';
+        if (!pCrawl->children[index])
+            pCrawl->children[index] = new TrieNode;
+        pCrawl = pCrawl->children[index];
+    }
+    pCrawl->isEndOfWord = true;
+}
+```
+
+## Search
+
+Searching for a key is similar to insertion, except that we only compare the characters and move down. The search can terminate because the string ends or because a node for the next character is missing. In the former case, if the `isEndOfWord` field of the last node is true, the key exists in the trie. In the latter case, the search terminates without examining all the characters of the key, since the key is not present in the trie. 
+
+```cpp
+bool search(struct TrieNode *root, string key) {
+    TrieNode *pCrawl = root;
+    for (int i = 0; i < key.length(); i++) {
+        int index = key[i] - 'a';
+        if (!pCrawl->children[index])
+            return false;
+        pCrawl = pCrawl->children[index];
+    }
+    return (pCrawl != NULL && pCrawl->isEndOfWord);
+}
+```
+
+Insert and search cost $\mathcal{O}(\text{key\_length})$. However, the memory requirements of a trie are high: $\mathcal{O}(\text{ALPHABET\_SIZE} \times \text{key\_length} \times N)$, where $N$ is the number of keys in the trie. There are more memory-efficient representations of trie nodes (e.g. compressed tries, ternary search trees, etc.).
diff --git a/docs/dynamic-programming/bitmask-dp.md b/docs/dynamic-programming/bitmask-dp.md
new file mode 100644
index 0000000..64459d6
--- /dev/null
+++ b/docs/dynamic-programming/bitmask-dp.md
@@ -0,0 +1,89 @@
+---
+title: Bitmask DP
+tags:
+  - Dynamic Programming
+  - Bitmask DP
+---
+
+## What is Bitmask?
+
+Let’s say that we have a set of objects. How can we represent a subset of this set? One way is to use a map, mapping each object to a Boolean value indicating whether the object is picked. Another way, if the objects can be indexed by integers, is to use a Boolean array. However, these can be slow due to the overhead of the map and array structures. If the size of the set is not too large (less than 64), a bitmask is much more useful and convenient.
+
+An integer is a sequence of bits. Thus, we can use integers to represent a small set of Boolean values, and we can perform all the set operations using bit operations. These bit operations are faster than map and array operations, and the time difference may be significant in some problems.
+
+In a bitmask, the \( i \)-th bit from the right represents the \( i \)-th object. For example, let \( A = \{1, 2, 3, 4, 5\} \); we can represent \( B = \{1, 2, 4\} \) with the bitmask 11 (01011). 
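As a quick sanity check of this encoding, here is a minimal sketch (the helper names `subset_mask` and `contains` are ours, not from any library):

```cpp
#include <cassert>

// Build the bitmask for a subset given as an array of 1-indexed object ids.
// Object i is stored in bit (i - 1), i.e. the i-th bit from the right.
int subset_mask(const int* objects, int count) {
    int mask = 0;
    for (int i = 0; i < count; i++)
        mask |= 1 << (objects[i] - 1);
    return mask;
}

// Check whether object i (1-indexed) is in the subset.
bool contains(int mask, int i) {
    return (mask >> (i - 1)) & 1;
}
```

For \( B = \{1, 2, 4\} \), `subset_mask` produces \( 01011_2 = 11 \), matching the example above.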
+
+---
+
+## Bitmask Operations
+
+- **Add the \( i \)-th object to the subset:**
+  Set the \( i \)-th bit to 1:
+  \( \text{mask } = \text{mask } | \text{ } (1 << i) \)
+
+- **Remove the \( i \)-th object from the subset:**
+  Set the \( i \)-th bit to 0:
+  \( \text{mask } = \text{mask } \& \sim (1 << i) \)
+
+- **Check whether the \( i \)-th object is in the subset:**
+  Check if the \( i \)-th bit is set:
+  \( \text{mask } \& \text{ } (1 << i) \).
+  If the expression is nonzero, the \( i \)-th object is in the subset; if it is equal to 0, the \( i \)-th object is not in the subset.
+
+- **Toggle the existence of the \( i \)-th object:**
+  XOR the \( i \)-th bit with 1, turning 1 into 0 and 0 into 1:
+  \( \text{mask} = \text{mask}\) ^ \( (1 << i) \)
+
+- **Count the number of objects in the subset:**
+  Use a built-in function to count the number of 1’s in an integer variable:
+  `__builtin_popcount(mask)` for integers or `__builtin_popcountll(mask)` for long longs.
+
+---
+
+## Iterating Over Subsets
+
+- **Iterate through all subsets of a set with size \( n \):**
+  \( \text{for (int x = 0; x < (1 << n); ++x)} \)
+
+- **Iterate through all non-empty subsets of a subset with the mask \( y \):**
+  \( \text{for (int x = y; x > 0; x = (y \& (x - 1)))} \)
+
+---
+
+## Task Assignment Problem
+
+There are \( N \) people and \( N \) tasks, and each task is going to be allocated to a single person. We are also given a matrix `cost` of size \( N \times N \), where `cost[i][j]` denotes how much person \( i \) charges for task \( j \). We need to assign each task to a person such that the total cost is minimized. Note that each task is allocated to only one person, and each person is allocated only one task.
+
+### Naive Approach:
+
+Try \( N! \) possible assignments.
+**Time complexity:** \( O(N!) \).
+
+### DP Approach:
+
+For every possible subset, find the new subsets that can be generated from it and update the DP array. 
Here, we use bitmasking to represent subsets and iterate over them: `dp[mask]` holds the minimum cost of assigning the tasks in `mask` to the first `popcount(mask)` people.
+**Time complexity:** \( O(2^N \times N) \).
+
+**Note:** The [Hungarian Algorithm](https://en.wikipedia.org/wiki/Hungarian_algorithm) solves this problem in \( O(N^3) \) time complexity.
+
+Solution code for the DP approach:
+
+```cpp
+// dp[0] = 0 and every other entry starts at infinity
+for (int mask = 0; mask < (1 << n); ++mask)
+{
+    for (int j = 0; j < n; ++j)
+    {
+        if ((mask & (1 << j)) == 0) // jth task not assigned
+        {
+            // assign task j to person popcount(mask), the next unused person
+            dp[mask | (1 << j)] = min(dp[mask | (1 << j)], dp[mask] + cost[__builtin_popcount(mask)][j]);
+        }
+    }
+}
+// after this operation our answer is stored in dp[(1 << N) - 1]
+```
+
+---
+
+## References
+
+- [Bitmask Tutorial on HackerEarth](https://www.hackerearth.com/practice/algorithms/dynamic-programming/bit-masking/tutorial/)
\ No newline at end of file
diff --git a/docs/dynamic-programming/common-dp-problems.md b/docs/dynamic-programming/common-dp-problems.md
new file mode 100644
index 0000000..0ab2a15
--- /dev/null
+++ b/docs/dynamic-programming/common-dp-problems.md
@@ -0,0 +1,236 @@
+---
+title: Common Dynamic Programming Problems
+tags:
+  - Dynamic Programming
+  - Common Dynamic Programming Problems
+---
+
+## Coin Problem
+
+As discussed earlier, the greedy approach doesn’t always work for the coin problem. For example, if the coins are \{4, 3, 1\} and the target sum is \(6\), the greedy algorithm produces the solution \(4+1+1\), while the optimal solution is \(3+3\). This is where dynamic programming (DP) helps.
+
+### Solution
+
+#### Approach:
+
+1. If \( V = 0 \), then 0 coins are required.
+2. If \( V > 0 \), compute \( \text{minCoins}(coins[0..m-1], V) = \min \{ 1 + \text{minCoins}(V - \text{coins}[i]) \} \) for all \( i \) where \( \text{coins}[i] \leq V \). 
+
+```python
+import sys
+
+def minCoins(coins, V):
+    # base case
+    if V == 0:
+        return 0
+
+    n = len(coins)
+    # Initialize result
+    res = sys.maxsize
+
+    # Try every coin that has a value smaller than or equal to V
+    for i in range(0, n):
+        if coins[i] <= V:
+            sub_res = minCoins(coins, V - coins[i])
+
+            # Check for sys.maxsize to avoid overflow and see if
+            # the result can be minimized
+            if sub_res != sys.maxsize and sub_res + 1 < res:
+                res = sub_res + 1
+
+    return res
+```
+
+## Knapsack Problem
+
+We are given the weights and values of \( n \) items, and we are to put these items in a knapsack of capacity \( W \) to get the maximum total value. In other words, we are given two integer arrays `val[0..n-1]` and `wt[0..n-1]`, which represent the values and weights associated with \( n \) items. We are also given an integer \( W \), which represents the knapsack's capacity. Our goal is to find the maximum-value subset of `val[]` such that the sum of the weights of this subset is smaller than or equal to \( W \). We cannot break an item; we must either pick the complete item or leave it.
+
+#### Approach:
+
+There are two cases for every item:
+1. The item is included in the optimal subset.
+2. The item is not included in the optimal subset.
+
+The maximum value that can be obtained from \( n \) items is the maximum of the following two values:
+1. The maximum value obtained by \( n-1 \) items and \( W \) weight (excluding the \( n \)-th item).
+2. The value of the \( n \)-th item plus the maximum value obtained by \( n-1 \) items and \( W - \text{weight of the } n\text{-th item} \) (including the \( n \)-th item).
+
+If the weight of the \( n \)-th item is greater than \( W \), then the \( n \)-th item cannot be included, and case 1 is the only possibility. 
+
+For example:
+
+- Knapsack max weight: \( W = 8 \) units
+- Weights of items: \( \text{wt} = \{3, 1, 4, 5\} \)
+- Values of items: \( \text{val} = \{10, 40, 30, 50\} \)
+- Total items: \( n = 4 \)
+
+The weight sum \( 8 \) is possible with two combinations: \{3, 5\} with a total value of 60, and \{1, 3, 4\} with a total value of 80. However, a better solution is \{1, 5\}, which has a total weight of 6 and a total value of 90.
+
+### Recursive Solution
+
+```python
+def knapSack(W, wt, val, n):
+
+    # Base Case
+    if n == 0 or W == 0:
+        return 0
+
+    # If the weight of the nth item is more than the knapsack capacity
+    # W, then this item cannot be included in the optimal solution
+    if wt[n - 1] > W:
+        return knapSack(W, wt, val, n - 1)
+
+    # return the maximum of two cases:
+    # (1) nth item included
+    # (2) not included
+    else:
+        return max(val[n - 1] + knapSack(W - wt[n - 1], wt, val, n - 1), knapSack(W, wt, val, n - 1))
+```
+
+### Dynamic Programming Solution
+
+It should be noted that the function above computes the same subproblems again and again; the time complexity of this naive recursive solution is exponential, \(O(2^n)\). Since subproblems are evaluated repeatedly, this problem has the overlapping subproblems property. As in other typical dynamic programming (DP) problems, recomputation of the same subproblems can be avoided by constructing a temporary array \(K[][]\) in a bottom-up manner. The following is a dynamic-programming-based implementation.
+
+```python
+def knapSack(W, wt, val, n):
+    K = [[0 for x in range(W + 1)] for x in range(n + 1)]
+
+    # Build table K[][] in bottom up manner
+    for i in range(n + 1):
+        for w in range(W + 1):
+            if i == 0 or w == 0:
+                K[i][w] = 0
+            elif wt[i - 1] <= w:
+                K[i][w] = max(val[i - 1] + K[i - 1][w - wt[i - 1]], K[i - 1][w])
+            else:
+                K[i][w] = K[i - 1][w]
+
+    return K[n][W]
+```
+
+## Longest Common Substring (LCS) Problem
+
+We are given two strings \( X \) and \( Y \), and our task is to find the length of the longest common substring. 
+
+### Sample Case:
+
+- Input: \( X = \text{"inzvahackerspace"} \), \( Y = \text{"spoilerspoiler"} \)
+- Output: 4
+
+The longest common substring is "ersp" and is of length 4.
+
+#### Approach:
+
+Let \( m \) and \( n \) be the lengths of the first and second strings, respectively. A simple solution is to consider all substrings of the first string one by one and check if they are substrings of the second string. Keep track of the maximum-length substring. There will be \( O(m^2) \) substrings, and checking whether one is a substring of the other takes \( O(n) \) time. Thus, the overall time complexity is \( O(n \cdot m^2) \).
+
+Dynamic programming can reduce this to \( O(m \cdot n) \). The idea is to find the length of the longest common suffix for all pairs of prefixes of the two strings and store these lengths in a table. The longest common suffix has the following property:
+
+\[
+LCSuff(X, Y, m, n) = LCSuff(X, Y, m-1, n-1) + 1 \text{ if } X[m-1] = Y[n-1]
+\]
+Otherwise, \( LCSuff(X, Y, m, n) = 0 \).
+
+The maximum over all longest common suffixes is the length of the longest common substring.
+
+### DP - Iterative
+
+```python
+def LCSubStr(X, Y):
+    m = len(X)
+    n = len(Y)
+
+    # Create a table to store lengths of
+    # longest common suffixes of substrings.
+    # Note that LCSuff[i][j] contains the
+    # length of the longest common suffix of
+    # X[0...i-1] and Y[0...j-1]. The first
+    # row and first column entries have no
+    # logical meaning; they are used only
+    # for simplicity of the program. 
+
+    # LCSuff is the table with zero
+    # value initially in each cell
+    LCSuff = [[0 for k in range(n + 1)] for l in range(m + 1)]
+
+    # To store the length of the
+    # longest common substring
+    result = 0
+
+    # Following steps build
+    # LCSuff[m+1][n+1] in bottom up fashion
+    for i in range(m + 1):
+        for j in range(n + 1):
+            if i == 0 or j == 0:
+                LCSuff[i][j] = 0
+            elif X[i - 1] == Y[j - 1]:
+                LCSuff[i][j] = LCSuff[i - 1][j - 1] + 1
+                result = max(result, LCSuff[i][j])
+            else:
+                LCSuff[i][j] = 0
+    return result
+```
+
+### DP - Recursive
+
+```python
+def lcs(i, j, count):
+    if i == 0 or j == 0:
+        return count
+
+    if X[i - 1] == Y[j - 1]:
+        count = lcs(i - 1, j - 1, count + 1)
+
+    count = max(count, max(lcs(i, j - 1, 0), lcs(i - 1, j, 0)))
+    return count
+```
+
+## Longest Increasing Subsequence (LIS) Problem
+
+The Longest Increasing Subsequence (LIS) problem is to find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.
+
+For example, given the array \([0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]\), the longest increasing subsequence has a length of 6; one such subsequence is \(\{0, 2, 6, 9, 11, 15\}\).
+
+### Solution
+
+A naive, brute-force approach is to generate every possible subsequence, check for monotonicity, and keep track of the longest one. However, this is prohibitively expensive, as there are \( O(2^N) \) subsequences to examine.
+
+Instead, we can use recursion to solve this problem and then optimize it with dynamic programming. We assume that we have a function that gives us the length of the longest increasing subsequence ending at a certain index.
+
+The base cases are:
+- The empty list, which returns 0.
+- A list with one element, which returns 1.
+
+For every index \( i \), calculate the longest increasing subsequence up to that point. 
The result can only be extended with the last element if the last element is greater than the element the prefix ends with, as otherwise the sequence wouldn’t be increasing.
+
+```python
+def longest_increasing_subsequence(arr):
+    # Returns the length of the longest increasing subsequence
+    # that ends with the last element of arr.
+    if not arr:
+        return 0
+    if len(arr) == 1:
+        return 1
+
+    max_ending_here = 1  # the last element alone is an increasing subsequence
+    for i in range(1, len(arr)):
+        # LIS ending at arr[i - 1], the last element of the prefix arr[:i]
+        ending_at_i = longest_increasing_subsequence(arr[:i])
+        if arr[-1] > arr[i - 1] and ending_at_i + 1 > max_ending_here:
+            max_ending_here = ending_at_i + 1
+    return max_ending_here
+```
+
+This is really slow due to repeated subcomputations (exponential in time). So, let’s use dynamic
+programming to store values so that we don’t recompute them later.
+
+We’ll keep an array A of length N, and A[i] will contain the length of the longest increasing subsequence ending at i. We can then use the same recurrence but look it up in the array instead:
+
+```python
+def longest_increasing_subsequence(arr):
+    if not arr:
+        return 0
+    cache = [1] * len(arr)
+    for i in range(1, len(arr)):
+        for j in range(i):
+            if arr[i] > arr[j]:
+                cache[i] = max(cache[i], cache[j] + 1)
+    return max(cache)
+```
+
+This now runs in \( O(N^2) \) time and \( O(N) \) space.
\ No newline at end of file
diff --git a/docs/dynamic-programming/digit-dp.md b/docs/dynamic-programming/digit-dp.md
new file mode 100644
index 0000000..41caf05
--- /dev/null
+++ b/docs/dynamic-programming/digit-dp.md
@@ -0,0 +1,109 @@
+---
+title: Digit DP
+tags:
+  - Dynamic Programming
+  - Digit DP
+---
+
+Problems that require calculating how many numbers between two values (say, \( A \) and \( B \)) satisfy a particular property can be solved using digit dynamic programming (Digit DP).
+
+---
+
+## How to Work on Digits
+
+While constructing our numbers recursively (from the left), we need a method to check whether our number is still smaller than the given boundary number. 
To achieve this, we keep a variable called "strict" while branching, which limits our ability to select digits that are larger than the corresponding digit of the boundary number.
+
+Let’s suppose the boundary number is \( A \). We start filling the number from the left (most significant digit) and set `strict` to `true`, meaning we cannot select any digit larger than the corresponding digit of \( A \). As we branch:
+
+- Branches that pick a digit less than the corresponding digit of \( A \) become non-strict (`strict = false`), because we guarantee that the number will be smaller than \( A \) from this point on.
+- For the branch that picks a digit equal to the corresponding digit of \( A \), the strictness continues to be `true`.
+
+---
+
+## Counting Problem Example
+
+**Problem:** How many numbers \( x \) are there in the range \( A \) to \( B \), where the digit \( d \) occurs exactly \( k \) times in \( x \)?
+
+**Constraints:** \( A, B < 10^{60}, k < 60 \).
+
+### Brute Force Approach:
+
+The brute-force solution would involve iterating over all the numbers in the range \([A, B]\) and counting the occurrences of the digit \( d \) one by one for each number. This has a time complexity of \( O(N \log_{10}(N)) \), which is too large for such constraints, so we need a more efficient approach.
+
+### Recursive Approach:
+
+We can recursively fill the digits of our number starting from the leftmost digit. At each step, we branch into 3 possibilities:
+
+1. Pick a digit that is **not** \( d \) and smaller than the corresponding digit of the boundary number.
+2. Pick the digit \( d \).
+3. Pick the digit that is equal to the corresponding digit of the boundary number.
+
+The depth of recursion is equal to the number of digits in the decimal representation of the boundary number, leading to a time complexity of \( O(3^{\log_{10} N}) \). Although this is better than brute force, it is still not efficient enough. 
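As a sketch of this branching (the function and variable names here are ours; enumerating each candidate digit realizes the three cases above, and for simplicity we ignore the leading-zero subtlety that arises when \( d = 0 \)), counting the numbers in \([0, N]\) in which digit \( d \) occurs exactly \( k \) times might look like:

```cpp
#include <string>
#include <cassert>

// num: decimal digits of the boundary N, most significant first.
// strict: whether the prefix chosen so far equals the prefix of N.
long long count_exact(const std::string& num, int idx, bool strict, int cnt, int d, int k) {
    if (cnt > k) return 0;                    // already too many occurrences of d
    if (idx == (int)num.size()) return cnt == k;
    long long total = 0;
    int limit = strict ? num[idx] - '0' : 9;  // strict: cannot exceed N's digit
    for (int dig = 0; dig <= limit; dig++)
        total += count_exact(num, idx + 1, strict && dig == limit,
                             cnt + (dig == d), d, k);
    return total;
}
```

For example, `count_exact("100", 0, true, 0, 1, 1)` counts the numbers in \([0, 100]\) containing the digit 1 exactly once.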

### Recursive Approach with Memoization:

We can further optimize this approach using memoization. We represent a DP state by \((\text{current index}, \text{current strictness}, \text{number of } d\text{'s})\), which denotes the number of possible configurations of the remaining digits after picking the current digit. We use a `dp[log10(N)][2][log10(N)]` array, where each value is computed at most once. Therefore, the worst-case time complexity is \( O((\log_{10} N)^2) \).

Solution Code:

```cpp
#include <bits/stdc++.h>
using namespace std;
#define ll long long
// Note: the stated constraints (A, B < 10^60) require reading the numbers as
// strings / big integers; this sketch assumes the values fit in a long long.
ll A, B, d, k, dg; // dg: digit count
vector<ll> v;      // digit vector
ll dp[65][2][65];

void setup(ll a)
{
    memset(dp, -1, sizeof dp);
    v.clear();
    ll tmp = a;
    while(tmp)
    {
        v.push_back(tmp % 10);
        tmp /= 10;
    }
    dg = (ll)v.size();
    reverse(v.begin(), v.end());
}

ll rec(int idx, bool strict, int count)
{
    if(count > k) return 0;
    if(idx == dg) return (count == k);
    if(dp[idx][strict][count] != -1) return dp[idx][strict][count];
    ll sum = 0;
    if(strict)
    {
        // digits smaller than v[idx] release the strictness
        for(ll dig = 0; dig < v[idx]; dig++)
            sum += rec(idx + 1, 0, count + (dig == d));
        // picking exactly v[idx] keeps the strictness
        sum += rec(idx + 1, 1, count + (v[idx] == d));
    }
    else
    {
        // no restriction, any digit 0..9 can be picked
        for(ll dig = 0; dig <= 9; dig++)
            sum += rec(idx + 1, 0, count + (dig == d));
    }
    return dp[idx][strict][count] = sum;
}

int main()
{
    cin >> A >> B >> d >> k;
    setup(B);
    ll countB = rec(0, 1, 0);        // countB is the answer for [0..B]
    setup(A - 1);
    ll countA = rec(0, 1, 0);        // countA is the answer for [0..A-1]
    cout << countB - countA << endl; // the difference gives us [A..B]
}
```

---

## References

- [Digit DP on Codeforces](https://codeforces.com/blog/entry/53960)

- [Digit DP on HackerRank](https://www.hackerrank.com/topics/digit-dp)
\ No newline at end of file
diff --git a/docs/dynamic-programming/dp-on-dags.md b/docs/dynamic-programming/dp-on-dags.md
new file mode 100644
index 0000000..6cf5424
--- /dev/null
+++ b/docs/dynamic-programming/dp-on-dags.md
@@ -0,0 +1,89 @@
---
title: DP on Directed Acyclic Graphs (DAGs)
tags:
  - Dynamic Programming
  - DP on Directed Acyclic Graphs (DAGs)
---

As we know, the nodes of a directed acyclic graph (DAG) can be sorted topologically, and DP can be implemented efficiently using this topological order.

First, we can find the topological order with a [topological sort](https://en.wikipedia.org/wiki/Topological_sorting) in \( O(N + M) \) time, where \( N \) is the number of nodes and \( M \) is the number of edges. Then, we can find the \( dp(V) \) values in topological order, where \( V \) is a node in the DAG and \( dp(V) \) is the answer for node \( V \). The answer and implementation will differ depending on the specific problem.

---

## Converting a DP Problem into a Directed Acyclic Graph

Many DP problems can be converted into a DAG. Let’s explore why this is the case.

While solving a DP problem, when we process a state, we evaluate it by considering all possible previous states. To do this, all of the previous states must be processed before the current state. From this perspective, some states depend on other states, forming a DAG structure.

However, note that some DP problems cannot be converted into a DAG and may require [hyper-graphs](https://en.wikipedia.org/wiki/Hypergraph). (For more details, refer to [**Advanced Dynamic Programming in Semiring and Hypergraph Frameworks**](https://en.wikipedia.org/wiki/Hypergraph)).

### Example Problem:

There are \( N \) stones numbered \( 1, 2, ..., N \). For each \( i \) ( \( 1 \leq i \leq N \) ), the height of the \( i \)-th stone is \( h_i \). There is a frog initially on stone 1. The frog can jump to stone \( i+1 \) or stone \( i+2 \). The cost of a jump from stone \( i \) to stone \( j \) is \( | h_i − h_j | \). Find the minimum possible cost to reach stone \( N \).

### Solution:

We define \( dp[i] \) as the minimum cost to reach the \( i \)-th stone. The answer will be \( dp[N] \). The recurrence relation is defined as:

\[
dp[i] = \min(dp[i−1] + |h_i − h_{i−1}|, dp[i−2] + |h_i − h_{i−2}|)
\]

For \( N = 5 \), we can see that to calculate \( dp[5] \), we need to calculate \( dp[4] \) and \( dp[3] \).
Similarly:

- \( dp[4] \) depends on \( dp[3] \) and \( dp[2] \),
- \( dp[3] \) depends on \( dp[2] \) and \( dp[1] \),
- \( dp[2] \) depends on \( dp[1] \).

These dependencies form a DAG, where the nodes represent the stones, and the edges represent the transitions between them based on the jumps.

```mermaid
graph LR
    A(dp_1) --> B(dp_2);
    A --> C(dp_3);
    B --> C;
    B --> D(dp_4);
    C --> D;
    C --> E(dp_5);
    D --> E;
```

## DP on Directed Acyclic Graph Problem

Given a DAG with \( N \) nodes and \( M \) weighted edges, find the **longest path** in the DAG.

### Complexity:

The time complexity for this problem is \( O(N + M) \), where \( N \) is the number of nodes and \( M \) is the number of edges.

Solution Code:

```cpp
// topological sort is not written here, so we assume tp already holds the order
// note that tp is reverse topologically sorted
// vector<int> tp
// n, m and vector<pair<int,ll>> adj[] are given. Pair denotes {node, weight}.
// flag[] denotes whether a node is processed or not. Initially all zero.
// dp[] is the DP array. Initially all zero.

for (int i = 0; i < (int)tp.size(); ++i) // processing in order
{
    int curNode = tp[i];

    for (auto v : adj[curNode]) // iterate through all neighbours
        if(flag[v.first])       // if a neighbour is already processed
            dp[curNode] = max(dp[curNode], dp[v.first] + v.second);

    flag[curNode] = 1;
}
// answer is max(dp[1..n])
```

---

## References

- [NOI IOI training week-5](https://noi.ph/training/weekly/week5.pdf)

- [DP on Graphs MIT](https://courses.csail.mit.edu/6.006/fall11/rec/rec19.pdf)
\ No newline at end of file
diff --git a/docs/dynamic-programming/dp-on-rooted-trees.md b/docs/dynamic-programming/dp-on-rooted-trees.md
new file mode 100644
index 0000000..23aadf4
--- /dev/null
+++ b/docs/dynamic-programming/dp-on-rooted-trees.md
@@ -0,0 +1,72 @@
---
title: DP on Rooted Trees
tags:
  - Dynamic Programming
  - DP on Rooted Trees
---

In dynamic programming (DP) on rooted trees, we define functions for the nodes of the tree, which are calculated recursively based on the children of each node. A common DP state is associated with a node \(i\), representing the sub-tree rooted at node \(i\).

---

## Problem

Given a tree \( T \) of \( N \) (1-indexed) nodes, where each node \( i \) has \( C_i \) coins attached to it, the task is to choose a subset of nodes such that no two adjacent nodes (nodes directly connected by an edge) are chosen, and the sum of coins attached to the chosen subset is maximized.

### Approach:

We define two functions, \( dp1(V) \) and \( dp2(V) \), as follows:

- \( dp1(V) \): The optimal solution for the sub-tree of node \( V \) when node \( V \) **is included** in the answer.
- \( dp2(V) \): The optimal solution for the sub-tree of node \( V \) when node \( V \) **is not included** in the answer.

The final answer is the maximum of these two cases at the root node:

\[
\max(dp1(\text{root}), dp2(\text{root}))
\]

### Recursive Definitions:

- \( dp1(V) = C_V + \sum_{i=1}^{n} dp2(v_i) \), where \( n \) is the number of children of node \( V \), and \( v_i \) is the \( i \)-th child of node \( V \).
  This represents the scenario where node \( V \) is included in the chosen subset, so none of its children can be selected.

- \( dp2(V) = \sum_{i=1}^{n} \text{max}(dp1(v_i), dp2(v_i)) \).
  This represents the scenario where node \( V \) is not included, so the optimal solution for each child \( v_i \) can either include or exclude that child.

### Complexity:

The time complexity for this approach is \( O(N) \), where \( N \) is the number of nodes in the tree. This is because the solution involves a depth-first search (DFS) traversal of the tree, and each node is visited only once.

```cpp
// pV is parent of V
void dfs(int V, int pV)
{
    // base case: when dfs reaches a leaf, the loop below never recurses,
    // so dp1 = C[V] and dp2 = 0.

    // for storing sums of dp2 and max(dp1, dp2) over all children of V
    int sum1 = 0, sum2 = 0;

    // traverse over all children
    for (auto v : adj[V])
    {
        if (v == pV)
            continue;
        dfs(v, V);
        sum1 += dp2[v];
        sum2 += max(dp1[v], dp2[v]);
    }

    dp1[V] = C[V] + sum1;
    dp2[V] = sum2;
}
// Nodes are 1-indexed, therefore the answers are stored in dp1[1] and dp2[1];
// for the final answer we take max(dp1[1], dp2[1]) after calling dfs(1, 0).
```

---

## References

- [DP on Tree on CodeForces](https://codeforces.com/blog/entry/20935)
\ No newline at end of file
diff --git a/docs/dynamic-programming/dynamic-programming.md b/docs/dynamic-programming/dynamic-programming.md
new file mode 100644
index 0000000..77799f1
--- /dev/null
+++ b/docs/dynamic-programming/dynamic-programming.md
@@ -0,0 +1,115 @@
---
title: Dynamic Programming
tags:
  - Dynamic Programming
---

Dynamic programming (DP) is a technique used to avoid computing the same sub-solution multiple times in a recursive algorithm. A sub-solution of the problem is constructed from the previously found ones. DP solutions typically have polynomial complexity, which makes them much faster than exponential-time techniques such as backtracking or brute force.

## Memoization - Top Down

Memoization ensures that a method doesn’t run for the same inputs more than once by keeping a record of the results for the given inputs (usually in a hash map).

To avoid duplicate work caused by recursion, we can use a cache that maps inputs to outputs. The approach involves:

- Checking the cache to see if we can avoid computing the answer for any given input.
- Saving the results of any calculations to the cache.

Memoization is a common strategy for dynamic programming problems where the solution is composed of solutions to the same problem with smaller inputs, such as the Fibonacci problem.

Another strategy for dynamic programming is the **bottom-up** approach, which is often cleaner and more efficient.

## Bottom-Up

The bottom-up approach avoids recursion, saving the memory cost associated with building up the call stack. It "starts from the beginning" and works towards the final solution, whereas a recursive algorithm often "starts from the end and works backwards."

## An Example - Fibonacci

Let’s start with a well-known example: finding the \(n\)-th Fibonacci number.
The Fibonacci sequence is defined as: + +\[ +F_n = F_{n−1} + F_{n−2}, \quad \text{with } F_0 = 0 \text{ and } F_1 = 1 +\] + +There are several approaches to solving this problem: + +### Recursion + +In a recursive approach, the function calls itself to compute the previous two Fibonacci numbers until reaching the base cases. + +```python +def fibonacci(n): + if (n == 0): + return 0 + if (n == 1): + return 1 + + return fibonacci(n - 1) + fibonacci(n - 2) +``` + +### Dynamic Programming + +- **Top-Down - Memoization:** + Recursion leads to unnecessary repeated calculations. Memoization solves this by caching the results of previously computed Fibonacci numbers, so they don't have to be recalculated. + +```python +cache = {} + +def fibonacci(n): + if (n == 0): + return 0 + if (n == 1): + return 1 + if (n in cache): + return cache[n] + + cache[n] = fibonacci(n - 1) + fibonacci(n - 2) + + return cache[n] +``` + +
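The same caching can also be obtained with Python's built-in `functools.lru_cache` decorator — a minimal sketch of the idea:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # the decorator maintains the cache for us
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(50))  # 12586269025
```

Each distinct input is computed only once, so `fibonacci(50)` needs just 51 distinct calls instead of an exponential number.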

<figure markdown>
![Recursive vs Memoization](img/recursive_memoization.png){ width="90%" }
<figcaption>Visualization of Recursive Memoization</figcaption>
</figure>

- **Bottom-Up:**
  The bottom-up approach eliminates recursion by computing the Fibonacci numbers in order, starting from the base cases and building up to the desired value.

```python
cache = {}

def fibonacci(n):
    cache[0] = 0
    cache[1] = 1

    for i in range(2, n + 1):
        cache[i] = cache[i - 1] + cache[i - 2]

    return cache[n]
```

Additionally, this approach can be optimized further by using constant space and only storing the necessary partial results along the way.

```python
def fibonacci(n):
    if n < 2:
        return n

    fib_minus_2 = 0
    fib_minus_1 = 1

    for i in range(2, n + 1):
        fib = fib_minus_1 + fib_minus_2
        fib_minus_1, fib_minus_2 = fib, fib_minus_1

    return fib
```

## How to Apply Dynamic Programming?

To apply dynamic programming, follow these steps:

- **Find the recursion in the problem:** Identify how the problem can be broken down into smaller subproblems.
- **Top-down approach:** Store the result of each subproblem in a table to avoid recomputation.
- **Bottom-up approach:** Find the correct order to evaluate the results so that partial results are available when needed.

Dynamic programming generally works for problems that have an inherent left-to-right order, such as strings, trees, or integer sequences. If the naive recursive algorithm does not compute the same subproblem multiple times, dynamic programming won't be useful.
\ No newline at end of file
diff --git a/docs/dynamic-programming/greedy-algorithms.md b/docs/dynamic-programming/greedy-algorithms.md
new file mode 100644
index 0000000..373a378
--- /dev/null
+++ b/docs/dynamic-programming/greedy-algorithms.md
@@ -0,0 +1,187 @@
---
title: Greedy Algorithms
tags:
  - Dynamic Programming
  - Greedy Algorithms
---

A *greedy algorithm* is an algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage with the hope of finding a global optimum.
A greedy algorithm never takes back its choices, but directly constructs the final solution. For this reason, greedy algorithms are usually very efficient.

The difficulty in designing greedy algorithms is to find a greedy strategy that always produces an optimal solution to the problem. The locally optimal choices in a greedy algorithm should also be globally optimal. It is often difficult to argue that a greedy algorithm works.

## Coin Problem

We are given a value \( V \). If we want to make change for \( V \) cents, and we have an infinite supply of each of the coin denominations \( C_1, C_2, \dots, C_m \) (sorted in descending order), what is the minimum number of coins needed to make the change?

### Solution

#### Approach:

1. Initialize the result as empty.
2. Find the largest denomination that is smaller than the amount.
3. Add the found denomination to the result. Subtract the value of the found denomination from the amount.
4. If the amount becomes 0, then print the result. Otherwise, repeat steps 2 and 3 for the new value of the amount.

```python
def min_coins(coins, amount):
    n = len(coins)
    for i in range(n):
        while amount >= coins[i]:
            # while loop is needed since one coin can be used multiple times
            amount -= coins[i]
            print(coins[i])
```

For example, if the coins are the euro coins (in cents) \( \{200, 100, 50, 20, 10, 5, 2, 1\} \) and the amount is 548, the optimal solution is to select coins \(200+200+100+20+20+5+2+1\), whose sum is \(548\).
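The greedy choice on the euro example can be checked with a small variant that returns the selected coins instead of printing them (`greedy_coins` is our name for this sketch, not from the text):

```python
def greedy_coins(coins, amount):
    # coins are assumed to be sorted in descending order
    used = []
    for c in coins:
        while amount >= c:  # take the largest denomination as long as it fits
            amount -= c
            used.append(c)
    return used

print(greedy_coins([200, 100, 50, 20, 10, 5, 2, 1], 548))
# [200, 200, 100, 20, 20, 5, 2, 1]
```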

<figure markdown>
![Coin Change Problem](img/coin_change.png){ width="90%" }
<figcaption>Visualization of the Coin Change Problem</figcaption>
</figure>

In the general case, the coin set can contain any kind of coins, and the greedy algorithm does not necessarily produce an optimal solution.

We can prove that a greedy algorithm does not work by showing a counterexample where the algorithm gives a wrong answer. In this problem, we can easily find a counterexample: if the coins are \( \{6, 5, 2\} \) and the target sum is \(10\), the greedy algorithm produces the solution \(6+2+2\), while the optimal solution is \(5+5\).

## Scheduling

Many scheduling problems can be solved using greedy algorithms. A classic problem is as follows:

We are given an array of jobs where every job has a deadline and an associated profit if the job is finished before the deadline. It is also given that every job takes a single unit of time, so the minimum possible deadline for any job is 1. How do we maximize total profit if only one job can be scheduled at a time?

### Solution

A simple solution is to generate all subsets of the given set of jobs, check each subset for feasibility, and keep track of the maximum profit among all feasible subsets. The time complexity of this solution is exponential. Instead, this is a standard greedy algorithm problem:

#### Approach:

1. Sort all jobs in decreasing order of profit.
2. Initialize the result sequence as the first job in sorted jobs.
3. For the remaining \(n-1\) jobs:
    - If the current job can fit in the current result sequence without missing the deadline, add the current job to the result.
    - Else ignore the current job.

```python
# sample job: ['x', 4, 25] -> [job_id, deadline, profit]
# jobs: array of 'job's
def print_job_scheduling(jobs, t):
    n = len(jobs)

    # Sort all jobs according to decreasing order of profit
    for i in range(n):
        for j in range(n - 1 - i):
            if jobs[j][2] < jobs[j + 1][2]:
                jobs[j], jobs[j + 1] = jobs[j + 1], jobs[j]

    # To keep track of free time slots
    result = [False] * t
    # To store the result (sequence of jobs)
    job = ['-1'] * t

    # Iterate through all given jobs
    for i in range(len(jobs)):
        # Find a free slot for this job
        # (Note that we start from the last possible slot)
        for j in range(min(t - 1, jobs[i][1] - 1), -1, -1):
            # Free slot found
            if not result[j]:
                result[j] = True
                job[j] = jobs[i][0]
                break
    print(job)
```

## Tasks and Deadlines

Let us now consider a problem where we are given \(n\) tasks with durations and deadlines, and our task is to choose an order to perform the tasks. For each task, we earn \(d - x\) points, where \(d\) is the task’s deadline and \(x\) is the moment when we finish the task. What is the largest possible total score we can obtain?

For example, suppose the tasks are as follows:

| Task | Duration | Deadline |
|------|----------|----------|
| A    | 4        | 2        |
| B    | 3        | 5        |
| C    | 2        | 7        |
| D    | 4        | 5        |

An optimal schedule for the tasks is \( C, B, A, D \). In this solution, \( C \) yields 5 points, \( B \) yields 0 points, \( A \) yields -7 points, and \( D \) yields -8 points, so the total score is -10.

Interestingly, the optimal order does not depend on the deadlines: a correct greedy strategy is to simply perform the tasks sorted by their durations in increasing order.

### Solution

1. Sort all tasks according to increasing order of duration.
2. Calculate the total points by iterating through all tasks, summing up the difference between the deadlines and the time at which each task is finished.

```python
# sample task: ['A', 4, 2] -> [task_id, duration, deadline]
def order_tasks(tasks):
    n = len(tasks)

    # Sort all tasks according to increasing order of duration
    for i in range(n):
        for j in range(n - 1 - i):
            if tasks[j][1] > tasks[j + 1][1]:
                tasks[j], tasks[j + 1] = tasks[j + 1], tasks[j]

    point = 0
    current_time = 0
    # Iterate through all given tasks and calculate the total score
    for i in range(len(tasks)):
        current_time = current_time + tasks[i][1]
        point = point + (tasks[i][2] - current_time)

    print(point)
```

## Minimizing Sums

We are given \(n\) numbers and our task is to find a value \(x\) that minimizes the sum:

\[
|a_1 − x|^c + |a_2 − x|^c + ... + |a_n − x|^c
\]

We focus on the cases \(c = 1\) and \(c = 2\).

### Case \(c = 1\)

In this case, we should minimize the sum:

\[
|a_1 − x| + |a_2 − x| + ... + |a_n − x|
\]

For example, if the numbers are \([1, 2, 9, 2, 6]\), the best solution is to select \(x = 2\), which produces the sum:

\[
|1 − 2| + |2 − 2| + |9 − 2| + |2 − 2| + |6 − 2| = 12
\]

In the general case, the best choice for \(x\) is the median of the numbers. For instance, the list \([1, 2, 9, 2, 6]\) becomes \([1, 2, 2, 6, 9]\) after sorting, so the median is 2. The median is an optimal choice because if \(x\) is smaller than the median, the sum decreases by increasing \(x\), and if \(x\) is larger, the sum decreases by lowering \(x\). Hence, the optimal solution is \(x = \text{median}\).

### Case \(c = 2\)

In this case, we minimize the sum:

\[
(a_1 − x)^2 + (a_2 − x)^2 + ... + (a_n − x)^2
\]

For example, if the numbers are \([1, 2, 9, 2, 6]\), the best solution is to select \(x = 4\), which produces the sum:

\[
(1 − 4)^2 + (2 − 4)^2 + (9 − 4)^2 + (2 − 4)^2 + (6 − 4)^2 = 46
\]

In the general case, the best choice for \(x\) is the average of the numbers.
For the given example, the average is:

\[
\frac{(1 + 2 + 9 + 2 + 6)}{5} = 4
\]

This result can be derived by expressing the sum as:

\[
n x^2 − 2x(a_1 + a_2 + ... + a_n) + (a_1^2 + a_2^2 + ... + a_n^2)
\]

The last part does not depend on \(x\), so we can ignore it. The remaining terms form a parabola opening upwards, and the minimum value occurs at \(x = \frac{s}{n}\), where \(s\) is the sum of the numbers, i.e., the average of the numbers.
\ No newline at end of file
diff --git a/docs/dynamic-programming/img/1st_power_matrix.png b/docs/dynamic-programming/img/1st_power_matrix.png
new file mode 100644
index 0000000..a26b31e
Binary files /dev/null and b/docs/dynamic-programming/img/1st_power_matrix.png differ
diff --git a/docs/dynamic-programming/img/3rd_power_matrix.png b/docs/dynamic-programming/img/3rd_power_matrix.png
new file mode 100644
index 0000000..5be7c79
Binary files /dev/null and b/docs/dynamic-programming/img/3rd_power_matrix.png differ
diff --git a/docs/dynamic-programming/img/coin_change.png b/docs/dynamic-programming/img/coin_change.png
new file mode 100644
index 0000000..e7d82b7
Binary files /dev/null and b/docs/dynamic-programming/img/coin_change.png differ
diff --git a/docs/dynamic-programming/img/left_childright_sibling.png b/docs/dynamic-programming/img/left_childright_sibling.png
new file mode 100644
index 0000000..7e4c01a
Binary files /dev/null and b/docs/dynamic-programming/img/left_childright_sibling.png differ
diff --git a/docs/dynamic-programming/img/recursive_memoization.png b/docs/dynamic-programming/img/recursive_memoization.png
new file mode 100644
index 0000000..514eb8a
Binary files /dev/null and b/docs/dynamic-programming/img/recursive_memoization.png differ
diff --git a/docs/dynamic-programming/index.md b/docs/dynamic-programming/index.md
new file mode 100644
index 0000000..b23c10b
--- /dev/null
+++ b/docs/dynamic-programming/index.md
@@ -0,0 +1,30 @@
---
title: Dynamic Programming
tags:
  - Dynamic Programming
---

**Editor:** Halil Çetiner

**Reviewers:** Onur Yıldız

## Introduction

This section covers *Greedy Algorithms* and *Dynamic Programming*. It is a fairly generous introduction to the concepts, followed by some common problems.

### [Greedy Algorithms](greedy-algorithms.md)
### [Dynamic Programming](dynamic-programming.md)
### [Common DP Problems](common-dp-problems.md)
### [Bitmask DP](bitmask-dp.md)
### [DP on Rooted Trees](dp-on-rooted-trees.md)
### [DP on Directed Acyclic Graphs](dp-on-dags.md)
### [Digit DP](digit-dp.md)
### [Walk Counting using Matrix Exponentiation](walk-counting-with-matrix.md)
### [Tree Child-Sibling Notation](tree-child-sibling-notation.md)

## References

1. ["Competitive Programmer’s Handbook" by Antti Laaksonen - Draft July 3, 2018](https://cses.fi/book/book.pdf)
2. [Wikipedia - Dynamic Programming](https://en.wikipedia.org/wiki/Dynamic_programming)
3. [Topcoder - Competitive Programming Community / Dynamic Programming from Novice to Advanced](https://www.topcoder.com/community/competitive-programming/tutorials/dynamic-programming-from-novice-to-advanced/)
4. [Hacker Earth - Dynamic Programming](https://www.hackerearth.com/practice/algorithms/dynamic-programming/)
5. [Geeks for Geeks - Dynamic Programming](https://www.geeksforgeeks.org/dynamic-programming/)
diff --git a/docs/dynamic-programming/tree-child-sibling-notation.md b/docs/dynamic-programming/tree-child-sibling-notation.md
new file mode 100644
index 0000000..57ea2c8
--- /dev/null
+++ b/docs/dynamic-programming/tree-child-sibling-notation.md
@@ -0,0 +1,51 @@
---
title: Tree Child-Sibling Notation
tags:
  - Dynamic Programming
  - Tree Child-Sibling Notation
---

In this method, we change the structure of the tree. In a standard tree, each parent node is connected to all of its children. However, in the **child-sibling notation**, a node stores a pointer to only one of its children.
Additionally, the node also stores a pointer to its immediate right sibling. + +In this notation, every node has at most 2 children: +- **Left child** (first child), +- **Right sibling** (first sibling). + +This structure is called the **LCRS (Left Child-Right Sibling)** notation. It effectively represents a binary tree, as every node has only two pointers (left and right). + +
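As an illustration, a minimal LCRS node might look like this in Python (the class and method names here are ours, for illustration only):

```python
class LCRSNode:
    def __init__(self, value):
        self.value = value
        self.left_child = None     # pointer to the first child
        self.right_sibling = None  # pointer to the next sibling

    def add_child(self, child):
        # prepend the new child to this node's child list
        child.right_sibling = self.left_child
        self.left_child = child

    def children(self):
        # walking the sibling chain recovers all children of this node
        node = self.left_child
        while node:
            yield node
            node = node.right_sibling

root = LCRSNode('A')
for v in 'DCB':  # children are prepended, so they come out as B, C, D
    root.add_child(LCRSNode(v))
print([c.value for c in root.children()])  # ['B', 'C', 'D']
```

Every node carries exactly two pointers, regardless of how many children it has in the original multi-way tree.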

<figure markdown>
![Child-Sibling Notation](img/left_childright_sibling.png){ width="90%" }
<figcaption>A tree in child-sibling notation</figcaption>
</figure>

## Why You Would Use the LCRS Notation

The primary reason for using LCRS notation is to save memory. In the LCRS structure, less memory is used compared to the standard tree notation.

### When You Might Use the LCRS Notation:

- **Memory is extremely scarce.**
- **Random access to a node’s children is not required.**

### Possible Cases for Using LCRS:

1. **When storing a large multi-way tree in main memory:**
   For example, [the phylogenetic tree](https://en.wikipedia.org/wiki/Phylogenetic_tree).

2. **In specialized data structures where the tree is used in specific ways:**
   For example, in the [**heap data structure**](https://en.wikipedia.org/wiki/Heap_%28data_structure%29), the main operations are:

    - Removing the root of the tree and processing each of its children,
    - Joining two trees together by making one tree a child of the other.

These operations can be done efficiently using the LCRS structure, making it convenient for working with heap data structures.

---

## References

- [LCRS article on Wikipedia](https://en.wikipedia.org/wiki/Left-child_right-sibling_binary_tree)

- [Link to the Figure used](https://contribute.geeksforgeeks.org/wp-content/uploads/new.jpeg)

- [LCRS possible uses Stackoverflow](https://stackoverflow.com/questions/14015525/what-is-the-left-child-right-sibling-representation-of-a-tree-why-would-you-us)
diff --git a/docs/dynamic-programming/walk-counting-with-matrix.md b/docs/dynamic-programming/walk-counting-with-matrix.md
new file mode 100644
index 0000000..274ab72
--- /dev/null
+++ b/docs/dynamic-programming/walk-counting-with-matrix.md
@@ -0,0 +1,60 @@
---
title: Walk Counting using Matrix Exponentiation
tags:
  - Dynamic Programming
  - Walk Counting using Matrix Exponentiation
---

Matrix exponentiation can be used to count the number of walks of a given length on a graph.

Let \( l \) be the desired walk length, and let \( A \) and \( B \) be nodes in a graph \( G \).
If \( D \) is the adjacency matrix of \( G \), then \( D^l[A][B] \) represents the number of walks from node \( A \) to node \( B \) with length \( l \), where \( D^k \) denotes the \( k \)-th power of the matrix \( D \). + +--- + +## Explanation: + +- **Adjacency Matrix \( D \):** + In the adjacency matrix of a graph, each entry \( D[i][j] \) denotes whether there is a direct edge between node \( i \) and node \( j \). Specifically: + - \( D[i][j] = 1 \) if there is an edge from \( i \) to \( j \), + - \( D[i][j] = 0 \) otherwise. + +- **Matrix Exponentiation:** + To find the number of walks of length \( l \) between nodes \( A \) and \( B \), we need to compute \( D^l \), which is the \( l \)-th power of the adjacency matrix \( D \). The entry \( D^l[A][B] \) will then give the number of walks of length \( l \) from node \( A \) to node \( B \). + +```mermaid +graph LR + A(2) --> B(1); + B --> C(3); + C --> A; + C --> D(4); + D --> C; +``` + + +| ![D, adjacency matrix of G](img/1st_power_matrix.png){ width="50%" } | ![D^3, 3rd power of the matrix D](img/3rd_power_matrix.png){ width="50%" } | +|:--------------------------------------------------------------------:|:--------------------------------------------------------------------------:| +| D, adjacency matrix of G | D^3, 3rd power of the matrix D | + + +From the matrix \( D^3 \), we can see that there are 4 total walks of length 3. + +Let \( S \) be the set of walks, and let \( w \) be a walk where \( w = \{n_1, n_2, ..., n_k\} \) and \( n_i \) is the \( i \)-th node of the walk. Then: + +\[ +S = \{\{1, 3, 4, 3\}, \{3, 4, 3, 2\}, \{3, 4, 3, 4\}, \{4, 3, 4, 3\}\} +\] +and \( |S| = 4 \). + +Using fast exponentiation on the adjacency matrix, we can efficiently find the number of walks of length \( k \) in \( O(N^3 \log k) \) time, where \( N \) is the number of nodes in the graph. 
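The fast-exponentiation step can be sketched in plain Python (the helper names are ours; a real contest solution would typically also reduce entries modulo a prime, since walk counts grow quickly):

```python
def mat_mult(X, Y):
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(D, k):
    n = len(D)
    # start from the identity matrix: the number of walks of length 0
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    while k:                    # binary (fast) exponentiation: O(N^3 log k)
        if k & 1:
            R = mat_mult(R, D)
        D = mat_mult(D, D)
        k >>= 1
    return R

# adjacency matrix of the directed 3-cycle 0 -> 1 -> 2 -> 0
D = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
print(mat_pow(D, 3)[0][0])  # one walk of length 3 from node 0 back to itself -> 1
```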

### Time Complexity Breakdown:

- [**Matrix Multiplication:**](https://en.wikipedia.org/wiki/Matrix_multiplication) The \( O(N^3) \) time complexity comes from multiplying two \( N \times N \) matrices.
- **Fast Exponentiation:** Fast exponentiation reduces the number of multiplications to \( \log k \), resulting in the overall time complexity of \( O(N^3 \log k) \).

This method allows for efficiently calculating the number of walks of any length \( k \) in large graphs.

---

## References

- [Walk Counting on Sciencedirect](https://www.sciencedirect.com/science/article/pii/S0012365X08002008)
\ No newline at end of file
diff --git a/docs/graph/binary-search-tree.md b/docs/graph/binary-search-tree.md
new file mode 100644
index 0000000..866726a
--- /dev/null
+++ b/docs/graph/binary-search-tree.md
@@ -0,0 +1,156 @@
---
title: Binary Search Tree
tags:
  - Tree
  - Binary Search
  - BST
---

A binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child.

For a binary tree to be a binary search tree, the values of all the nodes in the left sub-tree of the root node should be smaller than the root node's value. Also, the values of all the nodes in the right sub-tree of the root node should be larger than the root node's value.

<figure markdown>
![a simple binary search tree](img/binarytree.png)
<figcaption>A simple binary search tree</figcaption>
</figure>

## Insertion Algorithm

1. Compare the values of the root node and the element to be inserted.
2. If the value of the root node is larger, and a left child exists, then repeat step 1 with root = current root's left child. Else, insert the element as the left child of the current root.
3. If the value of the root node is smaller, and a right child exists, then repeat step 1 with root = current root's right child. Else, insert the element as the right child of the current root.

## Deletion Algorithm
- Deleting a node with no children: simply remove the node from the tree.
- Deleting a node with one child: remove the node and replace it with its child.
- Deleting a node with two children: find the inorder successor of the node, copy the contents of the inorder successor to the node, and delete the inorder successor.
- Note that the inorder successor can be obtained by finding the minimum value in the right child of the node.

## Sample Code

```c
// C program to demonstrate delete operation in binary search tree
#include <stdio.h>
#include <stdlib.h>

struct node
{
    int key;
    struct node *left, *right;
};

// A utility function to create a new BST node
struct node *newNode(int item)
{
    struct node *temp = (struct node *)malloc(sizeof(struct node));
    temp->key = item;
    temp->left = temp->right = NULL;
    return temp;
}

// A utility function to do inorder traversal of BST
void inorder(struct node *root)
{
    if (root != NULL)
    {
        inorder(root->left);
        printf("%d ", root->key);
        inorder(root->right);
    }
}

/* A utility function to insert a new node with given key in BST */
struct node* insert(struct node* node, int key)
{
    /* If the tree is empty, return a new node */
    if (node == NULL) return newNode(key);

    /* Otherwise, recur down the tree */
    if (key < node->key)
        node->left = insert(node->left, key);
    else
        node->right = insert(node->right, key);

    /* return the (unchanged) node pointer */
    return node;
}

/* Given a non-empty binary search tree, return the node
   with minimum key value found in that tree. Note that the entire tree does not
   need to be searched. */
struct node * minValueNode(struct node* node)
{
    struct node* current = node;

    /* loop down to find the leftmost leaf */
    while (current->left != NULL)
        current = current->left;

    return current;
}

/* Given a binary search tree and a key, this function deletes the key
   and returns the new root */
struct node* deleteNode(struct node* root, int key)
{
    // base case
    if (root == NULL) return root;

    // If the key to be deleted is smaller than the root's key,
    // then it lies in the left subtree
    if (key < root->key)
        root->left = deleteNode(root->left, key);

    // If the key to be deleted is greater than the root's key,
    // then it lies in the right subtree
    else if (key > root->key)
        root->right = deleteNode(root->right, key);

    // if the key is the same as the root's key, then this is the node
    // to be deleted
    else
    {
        // node with only one child or no child
        if (root->left == NULL)
        {
            struct node *temp = root->right;
            free(root);
            return temp;
        }
        else if (root->right == NULL)
        {
            struct node *temp = root->left;
            free(root);
            return temp;
        }

        // node with two children: get the inorder successor (smallest
        // in the right subtree)
        struct node* temp = minValueNode(root->right);

        // Copy the inorder successor's content to this node
        root->key = temp->key;

        // Delete the inorder successor
        root->right = deleteNode(root->right, temp->key);
    }
    return root;
}
```

## Time Complexity

The worst case time complexity of search, insert, and deletion operations is $\mathcal{O}(h)$, where $h$ is the height of the Binary Search Tree. In the worst case, we may have to travel from the root to the deepest leaf node. The height of a skewed tree may become $N$, and the time complexity of search and insert operations may become $\mathcal{O}(N)$.
So the time complexity of establishing $N$ node unbalanced tree may become $\mathcal{O}(N^2)$ (for example the nodes are being inserted in a sorted way). But, with random input the expected time complexity is $\mathcal{O}(NlogN)$. + +However, you can implement other data structures to establish Self-balancing binary search tree (which will be taught later), popular data structures that implementing this type of tree include: + +- 2-3 tree +- AA tree +- AVL tree +- B-tree +- Red-black tree +- Scapegoat tree +- Splay tree +- Treap +- Weight-balanced tree diff --git a/docs/graph/bipartite-checking.md b/docs/graph/bipartite-checking.md new file mode 100644 index 0000000..be2d885 --- /dev/null +++ b/docs/graph/bipartite-checking.md @@ -0,0 +1,58 @@ +--- +title: Bipartite Checking +tags: + - Bipartite Checking + - Graph +--- + +The question is in the title. Is the given graph bipartite? We can use BFS or DFS on graph. Lets first focus on BFS related algorithm. This procedure is very similar to BFS, we have an extra color array and we assign a color to each vertex when we are traversing the graph. Algorithm proof depends on fact that BFS explores the graph level by level. If the graph contains an odd cycle it means that there must be a edge between two vertices that are in same depth (layer, proof can be found on [[1 - Algorithm Design, Kleinberg, Tardos]]()). Let's say the colors are red and black and we traverse the graph with BFS and assign red to odd layers and black to even layers. Then we check the edges to see if there exists an edge that its vertices are same color. If there is a such edge, the graph is not bipartite, else the graph is bipartite. + +
+![If two nodes x and y in the same layer are joined by an edge, then the cycle through x, y, and their lowest common ancestor z has odd length, demonstrating that the graph cannot be bipartite.](img/bipartite_check.png) +
If two nodes x and y in the same layer are joined by an edge, then the cycle through x, y, and their lowest common ancestor z has odd length, demonstrating that the graph cannot be bipartite.
+
+
+```cpp
+#include <bits/stdc++.h>
+using namespace std;
+
+typedef vector<int> adjList;
+typedef vector<adjList> graph;
+typedef pair<int, int> ii;
+enum COLOR {RED, GREEN};
+
+bool bipartite_check(graph &g){
+    int root = 0; // Pick the 0-indexed node as root.
+    vector<bool> visited(g.size(), false);
+    vector<int> Color(g.size(), 0);
+    queue<ii> Q({ {root, 0} }); // insert root to queue, it is the first layer
+    visited[root] = true;
+    Color[root] = RED;
+    while (!Q.empty())
+    {
+        /* top.first is the node, top.second its depth, i.e. layer */
+        auto top = Q.front();
+        Q.pop();
+        for (int u : g[top.first]){
+            if (!visited[u]){
+                visited[u] = true;
+                // Mark even layers red, odd layers green
+                Color[u] = (top.second + 1) % 2 == 0 ? RED : GREEN;
+                Q.push({u, top.second + 1});
+            }
+        }
+    }
+    for (int i = 0; i < (int)g.size(); ++i){
+        for (auto v : g[i]){
+            if (Color[i] == Color[v]) return false;
+        }
+    }
+    return true;
+}
+
+int main(){
+    graph g(4);
+    g[0].push_back(1);
+    g[1].push_back(2);
+    g[2].push_back(3);
+    cout << (bipartite_check(g) ? "YES" : "NO") << endl;
+    return 0;
+}
+```
+
+The complexity of the algorithm is $O(V + E) + O(E)$: the BFS plus the loop over the edges. In Big-O notation this is simply $O(V + E)$.
diff --git a/docs/graph/breadth-first-search.md b/docs/graph/breadth-first-search.md
new file mode 100644
index 0000000..5409f24
--- /dev/null
+++ b/docs/graph/breadth-first-search.md
@@ -0,0 +1,78 @@
+---
+title: Breadth First Search
+tags:
+  - Graph
+  - Breadth First Search
+  - BFS
+---
+
+Breadth First Search (BFS) is an algorithm for traversing or searching a graph or tree. (For example, you can find the shortest path from one node to another in an unweighted graph.)
+![Breadth First Search Traversal](img/bfs.jpg) +
An example breadth first search traversal
+
+
+## Method
+
+BFS is a traversing algorithm in which you start from a selected node (the source or starting node) and traverse the graph layer by layer, first exploring the neighbour nodes (nodes directly connected to the source node) and then moving on to the next-level neighbours. [1]
+
+As the name BFS suggests, you traverse the graph breadthwise as follows:
+
+- First move horizontally and visit all the nodes of the current layer.
+- Add the neighbour nodes of the current layer to the queue.
+- Move to the next layer, whose nodes are waiting in the queue.
+
+Example question: given an unweighted graph, a source and a destination, find the shortest path from the source to the destination in the most efficient way.
+
+```cpp
+#include <bits/stdc++.h>
+using namespace std;
+
+const int MaxN = 100005; // Max number of nodes
+
+int n, m; // number of nodes, number of edges
+vector<int> adj[MaxN];
+bool mark[MaxN];
+
+void bfs(int starting_point, int ending_point) {
+    memset(mark, 0, sizeof(mark)); // clear the visited marks
+    queue<pair<int, int>> q; // the node, and the length of the path
+    // between this node and the starting node
+
+    q.push(make_pair(starting_point, 0));
+    mark[starting_point] = 1;
+
+    while (q.empty() == false) {
+        pair<int, int> tmp = q.front(); // get the next node
+        q.pop(); // delete it from q
+
+        if (ending_point == tmp.first) {
+            printf("The length of the path between %d - %d : %d\n",
+                   starting_point, ending_point, tmp.second);
+            return;
+        }
+
+        for (auto j : adj[tmp.first]) {
+            if (mark[j]) continue; // if it was reached before
+            mark[j] = 1;
+            q.push(make_pair(j, tmp.second + 1)); // add the next node to the queue
+        }
+    }
+}
+
+int main() {
+    cin >> n >> m;
+
+    for (int i = 0; i < m; i++) {
+        int a, b;
+        cin >> a >> b;
+        adj[a].push_back(b);
+    }
+
+    int start_point, end_point;
+    cin >> start_point >> end_point;
+    bfs(start_point, end_point);
+    return 0;
+}
+```
+
+## Complexity
+
+The time complexity of BFS is \(O(V + E)\), where \(V\) is the number of nodes and \(E\) is the number of edges.
\ No newline at end of file diff --git a/docs/graph/bridges-and-articulation-points.md b/docs/graph/bridges-and-articulation-points.md new file mode 100644 index 0000000..03d3abf --- /dev/null +++ b/docs/graph/bridges-and-articulation-points.md @@ -0,0 +1,127 @@ +--- +title: Bridges and Articulation Points +tags: + - Bridge + - Articulation Point + - Cut Vertex + - Cut Edge + - Graph +--- + +## DFS Order + +**DFS order** is traversing all the nodes of a given graph by fixing the root node in the same way as in the DFS algorithm, but without revisiting a discovered node. An important observation here is that the edges and nodes we use will form a **tree** structure. This is because, for every node (**except the root**), we only arrive from another node, and for the **root** node, we do not arrive from any other node, thus forming a **tree** structure. + +```cpp +void dfs(int node){ + used[node] = true; + for(auto it : g[node]) + if(!used[it]) + dfs(it); +} +``` + +### Types of Edges + +When traversing a graph using DFS order, several types of edges can be encountered. These edges will be very helpful in understanding some graph algorithms. + +**Types of Edges:** +- **Tree edge:** These are the main edges used while traversing the graph. +- **Forward edge:** These edges lead to a node that has been visited before and is located in our own subtree. +- **Back edge:** These edges lead to nodes that have been visited before but where the DFS process is not yet complete. +- **Cross edge:** These edges lead to nodes that have been visited before and where the DFS process is already complete. + +An important observation about these edges is that in an undirected graph, it is impossible to have a cross edge. This is because it is not possible for an edge emerging from a node where the DFS process is complete to remain unvisited. + +
+![Green-colored edges are tree edges. Edge (1,8) is a forward edge. Edge (6,4) is a back edge. Edge (5,4) is a cross edge.](img/types-of-edges.png) +
Green-colored edges are tree edges. Edge (1,8) is a forward edge. Edge (6,4) is a back edge. Edge (5,4) is a cross edge.
+
+
+## Bridge
+
+In an **undirected** and **connected** graph, if removing an edge causes the graph to become disconnected, this edge is called a **bridge**.
+
+### Finding Bridges
+
+Although there are several algorithms to find bridges (such as **Chain Decomposition**), we will focus on **Tarjan's Algorithm**, which is among the easiest to implement and the fastest.
+
+When traversing a graph using DFS, if there is a **back edge** coming out of the subtree of the lower endpoint of an edge, then that edge is **not** a bridge. This is because the **back edge** prevents the separation of the subtree and its ancestors when the edge is removed.
+
+This algorithm is based exactly on this principle, keeping track of the minimum depth reached by the **back edge**s within the subtree of each node.
+
+If the minimum depth reached by the **back edge**s in the subtree of the lower endpoint of an edge is greater than or equal to the depth of the upper endpoint, then this edge is a **bridge**. This is because no **back edge** in the subtree of the edge's lower endpoint reaches a node above the current edge. Therefore, if we remove this edge, the subtree and its ancestors become disconnected.
+
+Using Tarjan's Algorithm, we can find all bridges in a graph with a time complexity of $\mathcal{O}(V + E)$, where $V$ represents the number of vertices and $E$ represents the number of edges in the graph.
+
+```cpp
+int dfs(int node, int parent, int depth) {
+    int minDepth = depth;
+    dep[node] = depth; // the dep array holds the depth of each node
+    used[node] = true;
+    for (auto it : g[node]) {
+        if (it == parent)
+            continue;
+        if (used[it]) {
+            minDepth = min(minDepth, dep[it]);
+            // If the neighbouring node has been visited before,
+            // this edge is a back edge or a forward edge.
+            continue;
+        }
+        int val = dfs(it, node, depth + 1);
+        // val is the minimum depth reached from the child's subtree.
+        if (val >= depth + 1)
+            bridges.push_back({node, it});
+        minDepth = min(minDepth, val);
+    }
+    return minDepth;
+}
+```
+
+## Articulation Point
+
+In an undirected graph, if removing a node increases the number of connected components, that node is called an **articulation point** or **cut point**.
+![For example, if we remove node 0, the remaining nodes are split into two groups: 5 and 1, 2, 3, 4. Similarly, if we remove node 1, the nodes are split into 5, 0 and 2, 3, 4. Therefore, nodes 0 and 1 are **articulation points**.](img/cut-point.png) +
For example, if we remove node 0, the remaining nodes are split into two groups: 5 and 1, 2, 3, 4. Similarly, if we remove node 1, the nodes are split into 5, 0 and 2, 3, 4. Therefore, nodes 0 and 1 are **articulation points**.
+
+ +### Finding Articulation Points + +Tarjan's Algorithm for finding articulation points in an undirected graph: + +- Traverse the graph using DFS order. + +- For each node, calculate the depth of the minimum depth node that can be reached from the current node and its subtree through back edges. This value is called the **low** value of the node. + +- If the **low** value of any child of a non-root node is greater than or equal to the depth of the current node, then the current node is an **articulation point**. This is because no **back edge** in the subtree of this node can reach a node above the current node. Therefore, if this node is removed, its subtree will become disconnected from its ancestors. + +- If the current node is the root (the starting node of the DFS order) and there are multiple branches during the DFS traversal, then the root itself is an **articulation point**. This is because the root has multiple connected subgraphs. + +Using Tarjan's Algorithm, we can find all articulation points in a graph with a time complexity of $\mathcal{O}(V + E)$, where $V$ is the number of vertices and $E$ is the number of edges in the graph. + +```cpp +int dfs(int node, int parent, int depth) { + int minDepth = depth, children = 0; + dep[node] = depth; // dep array holds depth of each node. + used[node] = true; + for (auto it : g[node]) { + if (it == parent) + continue; + if (used[it]) { + minDepth = min(minDepth, dep[it]); + continue; + } + int val = dfs(it, node, depth + 1); + if (val >= depth and parent != -1) + isCutPoint[node] = true; + minDepth = min(minDepth, val); + children++; + } + // This if represents the root condition that we mentioned above. 
+    if (parent == -1 and children >= 2)
+        isCutPoint[node] = true;
+    return minDepth;
+}
+```
diff --git a/docs/graph/cycle-finding.md b/docs/graph/cycle-finding.md
new file mode 100644
index 0000000..a63a56d
--- /dev/null
+++ b/docs/graph/cycle-finding.md
@@ -0,0 +1,37 @@
+---
+title: Cycle Finding
+tags:
+  - Graph
+  - Cycle
+---
+
+**Cycle**: a sequence of nodes that starts and ends at the same node, visits every other node at most once, and contains at least two nodes.
+
+We can use DFS order to determine whether the graph has a cycle.
+
+If we find a back edge while traversing the graph, then we can say that the graph has a cycle, because a **back edge** connects a node to one of its ancestors and thereby closes a cycle.
+
+The algorithm we are going to use to find a cycle in a directed graph:
+
+- Traverse the graph in DFS order.
+- When you come to a node, color it gray and start visiting its neighbors.
+- If one of the current node's neighbors is gray, then there is a cycle in the graph, because a gray node is definitely an ancestor of the current node, and an edge to one of its ancestors is definitely a back edge.
+- Once you're done visiting the neighbors, color the node black.
+
+```cpp
+bool dfs(int node){
+    // The color array holds the color of each node.
+    // 0 represents white, 1 represents gray, and 2 represents black.
+    color[node] = 1;
+    for(int i = 0; i < g[node].size(); i++){
+        int child = g[node][i];
+        if(color[child] == 1)
+            return true;
+        if(!color[child])
+            if(dfs(child))
+                return true;
+    }
+    color[node] = 2;
+    return false;
+}
+```
diff --git a/docs/graph/definitions.md b/docs/graph/definitions.md
new file mode 100644
index 0000000..44ce9e3
--- /dev/null
+++ b/docs/graph/definitions.md
@@ -0,0 +1,62 @@
+---
+title: Graph Definitions
+tags:
+  - Graph
+---
+
+## Definitions of Common Terms
+
+- **Node** - An individual data element of a graph is called a node. A node is also known as a vertex.
+- **Edge** - An edge is a connecting link between two nodes, represented as e = {a,b}. An edge is also called an arc.
+- **Adjacent** - Two vertices are adjacent if they are connected by an edge.
+- **Degree** - The degree of a node is the number of edges incident to the node.
+- **Undirected Graphs** - Undirected graphs have edges that do not have a direction. The edges indicate a two-way relationship, in that each edge can be traversed in both directions.
+- **Directed Graphs** - Directed graphs have edges with direction. The edges indicate a one-way relationship, in that each edge can only be traversed in a single direction.
+- **Weighted Edges** - If each edge of a graph has an associated real number, this number is called its weight.
+- **Self-Loop** - An edge having the same node as both its source and its destination.
+- **Multi-Edge** - Two adjacent nodes may have more than one edge between each other.
+
+## Walks, Trails, Paths, Cycles and Circuits
+
+- **Walk** - A sequence of nodes and edges in a graph.
+- **Trail** - A walk that does not repeat any edge.
+- **Circuit** - A trail that starts and ends at the same node.
+- **Path** - A walk that does not repeat any node.
+- **Cycle** - A circuit that does not repeat any node other than the start/end node.
+
+## Special Graphs
+
+- **Complete Graph** - A graph having an edge between every pair of nodes.
+- **Connected Graph** - A graph with a path between every pair of nodes.
+- **Tree** - An undirected connected graph in which any two nodes are connected by exactly one path. There are several equivalent ways to recognize a tree:
+  - an undirected graph that is connected and has no cycles;
+  - an undirected graph that is acyclic, and in which a simple cycle is formed if any edge is added;
+  - an undirected graph that is connected, but becomes disconnected if any edge is removed;
+  - an undirected graph that is connected and has (number of nodes - 1) edges.
+
+### Bipartite Graphs
+
+A bipartite graph is a graph whose vertices can be divided into two disjoint and independent sets U and V such that every edge connects a vertex in U to one in V. The vertex sets U and V are usually called the parts of the graph [[1]](https://en.wikipedia.org/wiki/Bipartite_graph). The figure below shows an example. Bipartiteness is equivalent to graph coloring with two colors: every vertex gets a color, and the two endpoints of any edge must have different colors. In other words, if we can color all pairs of neighbours with two different colors, we can say that the graph is bipartite.
+![Example bipartite graph, all edges satisfy the coloring constraint](img/bipartite.png) +
Example bipartite graph, all edges satisfy the coloring constraint
+
+
+We have some observations here:
+
+- A graph is 2-colorable if and only if it is bipartite.
+- A graph contains no odd-length cycle if and only if it is bipartite.
+- Every tree is a bipartite graph, since trees do not contain any cycles.
+
+### Directed Acyclic Graphs
+
+A directed acyclic graph (DAG) is a finite directed graph with no directed cycles. Equivalently, a DAG is a directed graph that has a topological ordering (covered in this bundle): a sequence of the vertices such that every edge is directed from earlier to later in the sequence [[2]](https://en.wikipedia.org/wiki/Directed_acyclic_graph). DAGs can be used to encode precedence relations or dependencies in a natural way [[3 - Algorithm Design, Kleinberg, Tardos]](). There are several applications that use topological ordering directly, such as finding a critical path or automatic differentiation on computational graphs (the latter is extremely useful for deep learning frameworks [[4]](https://pytorch.org/docs/stable/autograd.html)).
+![Example Directed Acyclic Graphs](img/dag.png) +
Example Directed Acyclic Graphs
+
+ +
+![Example computational graph also a DAG, partial derivatives are written to edges respect to topological order](img/tree-def.png) +
Example computational graph also a DAG, partial derivatives are written to edges respect to topological order
+
diff --git a/docs/graph/depth-first-search.md b/docs/graph/depth-first-search.md new file mode 100644 index 0000000..c133b4a --- /dev/null +++ b/docs/graph/depth-first-search.md @@ -0,0 +1,97 @@ +--- +title: Depth First Search +tags: + - Graph + - Depth First Search + - DFS +--- + +Depth First Search (DFS) is an algorithm for traversing or searching tree. (For example, you can check if graph is connected or not via DFS) [2] + +
+![Depth First Search](img/dfs.jpg) +
Example of DFS traversal
+
+
+## Method
+
+The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It involves exhaustive searches of all the nodes by going ahead, if possible, else by backtracking.
+
+Here, the word backtrack means that when you are moving forward and there are no more nodes along the current path, you move backwards on the same path to find nodes to traverse. All the nodes on the current path are visited until all the unvisited nodes have been traversed, after which the next path is selected. [3]
+
+```cpp
+vector<bool> visited;
+void dfs(int v) {
+    visited[v] = true;
+    for (int u : adj[v]) {
+        if (!visited[u]) dfs(u);
+    }
+}
+```
+
+This recursive nature of DFS can be implemented using stacks. The basic idea is as follows: pick a starting node and push all its adjacent nodes into a stack. Pop a node from the stack to select the next node to visit and push all its adjacent nodes into the stack. Repeat this process until the stack is empty. However, ensure that the nodes that are visited are marked. This will prevent you from visiting the same node more than once. If you do not mark the nodes that are visited and you visit the same node more than once, you may end up in an infinite loop. [3]
+
+```
+DFS-iterative(G, s):          // Where G is the graph and s is the source vertex
+    let S be a stack
+    S.push(s)                 // Insert s into the stack
+    mark s as visited
+    while S is not empty:
+        // Pop a vertex from the stack to visit next
+        v = S.top()
+        S.pop()
+        // Push all the unvisited neighbours of v onto the stack
+        for all neighbours w of v in Graph G:
+            if w is not visited:
+                S.push(w)
+                mark w as visited
+```
+
+Example question: given an undirected graph, find out whether it is connected. An undirected graph is connected if there is a path between every pair of vertices.
+
+```cpp
+#include <bits/stdc++.h>
+using namespace std;
+
+const int MaxN = 100005; // Max number of nodes
+
+int n, m; // number of nodes, number of edges
+vector<int> adj[MaxN];
+bool mark[MaxN];
+
+void dfs(int k) {
+    mark[k] = 1; // visited
+    for (auto j : adj[k])     // iterate over adjacent nodes
+        if (mark[j] == false) // check whether it has been visited
+            dfs(j);           // repeat the operation for that node
+}
+
+int main() {
+    cin >> n >> m; // number of nodes, number of edges
+    for (int i = 0; i < m; i++) {
+        int a, b;
+        cin >> a >> b;
+        adj[a].push_back(b);
+        adj[b].push_back(a);
+    }
+
+    dfs(1);
+
+    bool connected = 1;
+    for (int i = 1; i <= n; i++)
+        if (mark[i] == 0) {
+            connected = 0;
+            break;
+        }
+
+    if (connected)
+        cout << "Graph is connected" << endl;
+    else
+        cout << "Graph is not connected" << endl;
+
+    return 0;
+}
+```
+
+## Complexity
+
+The time complexity of DFS is \(O(V+E)\) when implemented using an adjacency list (with an adjacency matrix it is \(O(V^2)\)), where \(V\) is the number of nodes and \(E\) is the number of edges. [4]
diff --git a/docs/graph/heap.md b/docs/graph/heap.md
new file mode 100644
index 0000000..2711201
--- /dev/null
+++ b/docs/graph/heap.md
@@ -0,0 +1,138 @@
+---
+title: Heap
+tags:
+  - Heap
+  - Priority Queue
+---
+
+![an example max-heap with 9 nodes](img/360px-Max-Heap.png)
+
an example max-heap with 9 nodes
+
+
+The heap is a complete binary tree with N nodes in which the values of all the nodes in the left and right subtrees of any node are smaller than that node's value (this variant is called a max-heap; a min-heap reverses the inequality).
+
+In a heap, the highest (or lowest) priority element is always stored at the root. A heap is not a sorted structure and can be regarded as partially ordered. As visible from the heap diagram, there is no particular relationship among nodes on any given level, even among siblings. Because a heap is a complete binary tree, it has the smallest possible height: a heap with $N$ nodes has $\log N$ height. A heap is a useful data structure when you need to repeatedly remove the object with the highest (or lowest) priority.
+
+## Implementation
+
+Heaps are usually implemented in an array (fixed-size or dynamic), and do not require pointers between elements. After an element is inserted into or deleted from a heap, the heap property may be violated, and the heap must be rebalanced by internal operations.
+
+The first (or last) element of the array contains the root. The next two elements of the array contain its children. The next four contain the four children of those two child nodes, etc. Thus the children of the node at position $n$ are at positions $2*n$ and $2*n + 1$ in a one-based array. This allows moving up or down the tree by doing simple index computations. Balancing a heap is done by sift-up or sift-down operations (swapping elements which are out of order), so we can build a heap from an array without requiring extra memory.
+![example a heap as an array](img/Heap-as-array.png) +
example a heap as an array
+
+
+## Insertion
+
+Add the new element at the end of the heap. Then compare it with its parent and, depending on whether it is a max-heap or a min-heap (a max-heap is one in which parents are always greater), swap it with the parent if the order is violated. If a swap happened, repeat the same operation for the parent.
+
+## Deletion
+
+If you are going to delete a node (whether it is the root or another node does not matter):
+
+1. Swap the node to be deleted with the last element of the heap to maintain the complete-tree structure.
+2. Delete the last element, which is now the node we wanted to delete in the first place.
+3. The swapped element may now be in the wrong place. To find its correct place, check its left and right children; if one of them is greater than the current node, swap it with the greatest child (or the smallest, if it is a min-heap).
+4. The current node may still be in the wrong place, so repeat step 3 as long as it is smaller than one of its children (or greater, if it is a min-heap).
+![](img/heap1.png) +![](img/heap2.png) +
an example deletion on a heap structure
+
+ +```py +class BinHeap: + def __init__(self): + self.heapList = [0] + self.currentSize = 0 + + def percUp(self,i): + while i // 2 > 0: + if self.heapList[i] < self.heapList[i // 2]: + tmp = self.heapList[i // 2] + self.heapList[i // 2] = self.heapList[i] + self.heapList[i] = tmp + i = i // 2 + + def insert(self,k): + self.heapList.append(k) + self.currentSize = self.currentSize + 1 + self.percUp(self.currentSize) + + def percDown(self,i): + while (i * 2) <= self.currentSize: + mc = self.minChild(i) + if self.heapList[i] > self.heapList[mc]: + tmp = self.heapList[i] + self.heapList[i] = self.heapList[mc] + self.heapList[mc] = tmp + i = mc + + def minChild(self,i): + if i * 2 + 1 > self.currentSize: + return i * 2 + else: + if self.heapList[i*2] < self.heapList[i*2+1]: + return i * 2 + else: + return i * 2 + 1 + + def delMin(self): + retval = self.heapList[1] + self.heapList[1] = self.heapList[self.currentSize] + self.currentSize = self.currentSize - 1 + self.heapList.pop() + self.percDown(1) + return retval + + def buildHeap(self,alist): + i = len(alist) // 2 + self.currentSize = len(alist) + self.heapList = [0] + alist[:] + while (i > 0): + self.percDown(i) + i = i - 1 + +bh = BinHeap() +bh.buildHeap([9,5,6,2,3]) + +print(bh.delMin()) +print(bh.delMin()) +print(bh.delMin()) +print(bh.delMin()) +print(bh.delMin()) +``` + +## Complexity + +Insertion $\mathcal{O}(logN)$, delete-min $\mathcal{O}(logN)$ , and finding minimum $\mathcal{O}(1)$. These operations depend on heap's height and heaps are always complete binary trees, basically the height is $logN$. (N is number of Node) + +## Priority Queue +Priority queues are a type of container adaptors, specifically designed so that its first element is always the greatest of the elements it contains, according to some strict weak ordering criterion. + +While priority queues are often implemented with heaps, they are conceptually distinct from heaps. 
A priority queue is an abstract concept like "a list" or "a map"; just as a list can be implemented with a linked list or an array, a priority queue can be implemented with a heap or a variety of other methods such as an unordered array.
+
+```cpp
+#include <iostream> // std::cout
+#include <queue>    // std::priority_queue
+using namespace std;
+
+int main () {
+    priority_queue<int> mypq;
+
+    mypq.push(30);
+    mypq.push(100);
+    mypq.push(25);
+    mypq.push(40);
+
+    cout << "Popping out elements...";
+    while (!mypq.empty()) {
+        cout << ' ' << mypq.top();
+        mypq.pop();
+    }
+    return 0;
+}
+```
diff --git a/docs/graph/img/360px-Max-Heap.png b/docs/graph/img/360px-Max-Heap.png new file mode 100644 index 0000000..2ee3822 Binary files /dev/null and b/docs/graph/img/360px-Max-Heap.png differ diff --git a/docs/graph/img/Heap-as-array.png b/docs/graph/img/Heap-as-array.png new file mode 100644 index 0000000..c0de22f Binary files /dev/null and b/docs/graph/img/Heap-as-array.png differ diff --git a/docs/graph/img/bfs.jpg b/docs/graph/img/bfs.jpg new file mode 100644 index 0000000..3aeead3 Binary files /dev/null and b/docs/graph/img/bfs.jpg differ diff --git a/docs/graph/img/biconnectivity.png b/docs/graph/img/biconnectivity.png new file mode 100644 index 0000000..a2c3783 Binary files /dev/null and b/docs/graph/img/biconnectivity.png differ diff --git a/docs/graph/img/binary-tree.png b/docs/graph/img/binary-tree.png new file mode 100644 index 0000000..6754630 Binary files /dev/null and b/docs/graph/img/binary-tree.png differ diff --git a/docs/graph/img/binarytree.png b/docs/graph/img/binarytree.png new file mode 100644 index 0000000..3b3303f Binary files /dev/null and b/docs/graph/img/binarytree.png differ diff --git a/docs/graph/img/bipartite.png b/docs/graph/img/bipartite.png new file mode 100644 index 0000000..42b8cb8 Binary files /dev/null and b/docs/graph/img/bipartite.png differ diff --git a/docs/graph/img/bipartite_check.png b/docs/graph/img/bipartite_check.png new file mode 100644 index
0000000..ac06a11 Binary files /dev/null and b/docs/graph/img/bipartite_check.png differ diff --git a/docs/graph/img/cut-point.png b/docs/graph/img/cut-point.png new file mode 100644 index 0000000..582d052 Binary files /dev/null and b/docs/graph/img/cut-point.png differ diff --git a/docs/graph/img/dag.png b/docs/graph/img/dag.png new file mode 100644 index 0000000..1aee76b Binary files /dev/null and b/docs/graph/img/dag.png differ diff --git a/docs/graph/img/dfs.jpg b/docs/graph/img/dfs.jpg new file mode 100644 index 0000000..293586c Binary files /dev/null and b/docs/graph/img/dfs.jpg differ diff --git a/docs/graph/img/directed_acyclic_graph.png b/docs/graph/img/directed_acyclic_graph.png new file mode 100644 index 0000000..0153999 Binary files /dev/null and b/docs/graph/img/directed_acyclic_graph.png differ diff --git a/docs/graph/img/first-flow.png b/docs/graph/img/first-flow.png new file mode 100644 index 0000000..dea58dc Binary files /dev/null and b/docs/graph/img/first-flow.png differ diff --git a/docs/graph/img/flow1.png b/docs/graph/img/flow1.png new file mode 100644 index 0000000..eff3156 Binary files /dev/null and b/docs/graph/img/flow1.png differ diff --git a/docs/graph/img/flow2.png b/docs/graph/img/flow2.png new file mode 100644 index 0000000..2c800c5 Binary files /dev/null and b/docs/graph/img/flow2.png differ diff --git a/docs/graph/img/flow3.png b/docs/graph/img/flow3.png new file mode 100644 index 0000000..f4e6689 Binary files /dev/null and b/docs/graph/img/flow3.png differ diff --git a/docs/graph/img/flow4.png b/docs/graph/img/flow4.png new file mode 100644 index 0000000..57aa3a0 Binary files /dev/null and b/docs/graph/img/flow4.png differ diff --git a/docs/graph/img/flow5.png b/docs/graph/img/flow5.png new file mode 100644 index 0000000..92fa64d Binary files /dev/null and b/docs/graph/img/flow5.png differ diff --git a/docs/graph/img/heap1.png b/docs/graph/img/heap1.png new file mode 100644 index 0000000..f609e43 Binary files /dev/null and 
b/docs/graph/img/heap1.png differ diff --git a/docs/graph/img/heap2.png b/docs/graph/img/heap2.png new file mode 100644 index 0000000..03323d1 Binary files /dev/null and b/docs/graph/img/heap2.png differ diff --git a/docs/graph/img/kruskal.jpg b/docs/graph/img/kruskal.jpg new file mode 100644 index 0000000..0112d6b Binary files /dev/null and b/docs/graph/img/kruskal.jpg differ diff --git a/docs/graph/img/mst.png b/docs/graph/img/mst.png new file mode 100644 index 0000000..75ca9db Binary files /dev/null and b/docs/graph/img/mst.png differ diff --git a/docs/graph/img/prim.png b/docs/graph/img/prim.png new file mode 100644 index 0000000..1191945 Binary files /dev/null and b/docs/graph/img/prim.png differ diff --git a/docs/graph/img/scc-graph.png b/docs/graph/img/scc-graph.png new file mode 100644 index 0000000..2084926 Binary files /dev/null and b/docs/graph/img/scc-graph.png differ diff --git a/docs/graph/img/scc.png b/docs/graph/img/scc.png new file mode 100644 index 0000000..5f49fc2 Binary files /dev/null and b/docs/graph/img/scc.png differ diff --git a/docs/graph/img/shortest.png b/docs/graph/img/shortest.png new file mode 100644 index 0000000..8d99026 Binary files /dev/null and b/docs/graph/img/shortest.png differ diff --git a/docs/graph/img/toporder.png b/docs/graph/img/toporder.png new file mode 100644 index 0000000..aca4793 Binary files /dev/null and b/docs/graph/img/toporder.png differ diff --git a/docs/graph/img/tree-def.png b/docs/graph/img/tree-def.png new file mode 100644 index 0000000..e51a346 Binary files /dev/null and b/docs/graph/img/tree-def.png differ diff --git a/docs/graph/img/types-of-edges.png b/docs/graph/img/types-of-edges.png new file mode 100644 index 0000000..4ff4134 Binary files /dev/null and b/docs/graph/img/types-of-edges.png differ diff --git a/docs/graph/index.md b/docs/graph/index.md new file mode 100644 index 0000000..ee058c9 --- /dev/null +++ b/docs/graph/index.md @@ -0,0 +1,37 @@ +--- +title: Graph +--- + +**Editor:** Kayacan Vesek 
+ +**Reviewers:** Yasin Kaya + +### [Introduction](introduction.md) +### [Definitions](definitions.md) +### [Representing Graphs](representing-graphs.md) +### [Tree Traversals](tree-traversals.md) +### [Binary Search Tree](./binary-search-tree.md) +### [Heap](heap.md) +### [Depth First Search](depth-first-search.md) +### [Breadth First Search](breadth-first-search.md) +### [Cycle Finding](cycle-finding.md) +### [Bipartite Checking](bipartite-checking.md) +### [Union Find](union-find.md) +### [Shortest Path](shortest-path.md) +### [Minimum Spanning Tree](minimum-spanning-tree.md) +### [Topological Sort](topological-sort.md) +### [Bridges and Articulation Points](bridges-and-articulation-points.md) +### [Strong Connectivity and Biconnectivity](strong-connectivity-and-biconnectivity.md) +### [Strongly Connected Components](strongly-connected-components.md) +### [Max Flow](max-flow.md) + +## References + +1. [https://www.hackerearth.com/practice/algorithms/graphs/breadth-first-search/tutorial/](https://www.hackerearth.com/practice/algorithms/graphs/breadth-first-search/tutorial/) +2. [https://www.geeksforgeeks.org/depth-first-search-or-dfs-for-a-graph/](https://www.geeksforgeeks.org/depth-first-search-or-dfs-for-a-graph/) +3. [https://cp-algorithms.com/graph/depth-first-search.html](https://cp-algorithms.com/graph/depth-first-search.html) +4. [https://www.hackerearth.com/practice/algorithms/graphs/depth-first-search/tutorial/](https://www.hackerearth.com/practice/algorithms/graphs/depth-first-search/tutorial/) +5. [Shortest Path. Wikipedia, the free online encyclopedia. Retrieved January 5, 2019](https://www.wikiwand.com/en/articles/Shortest_path_problem) +6. [Topological sort. Geeksforgeeks website. Retrieved January 5, 2019](https://www.geeksforgeeks.org/topological-sorting/) +7. [Topological Sort. Wikipedia, the free online encyclopedia. Retrieved January 5, 2019](https://en.wikipedia.org/wiki/Topological_sorting) +8. 
[https://en.wikipedia.org/wiki/Graph_theory](https://en.wikipedia.org/wiki/Graph_theory) diff --git a/docs/graph/introduction.md b/docs/graph/introduction.md new file mode 100644 index 0000000..48bfb0d --- /dev/null +++ b/docs/graph/introduction.md @@ -0,0 +1,22 @@ +--- +title: Introduction +tags: + - Graph +--- + +A graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense "related". The objects correspond to the mathematical abstractions called vertices (also called nodes or points) and each of the related pairs of vertices is called an edge. Typically, a graph is depicted in diagrammatic form as a set of dots for the vertices, joined by lines for the edges. [8] + +Why graphs? Graphs are usually used to represent different elements that are somehow related to each other. + +A graph consists of a finite set of vertices (or nodes) and a set of edges, each of which connects a pair of nodes. G = (V,E) + +V = set of nodes + +E = set of edges, where each edge e is represented as e = (a, b) + +Graphs are used to show relations between objects. So, some graphs may have directed edges (e.g. people and their love relationships that are not mutual: Alice may love Alex, while Alex is not in love with her), and some graphs may have weighted edges (e.g. people and their debt relationships). + +<figure markdown>
+![Directed Acyclic Graph](img/directed_acyclic_graph.png) +
Figure 1: a simple unweighted graph</figcaption>
+
\ No newline at end of file diff --git a/docs/graph/max-flow.md b/docs/graph/max-flow.md new file mode 100644 index 0000000..66da412 --- /dev/null +++ b/docs/graph/max-flow.md @@ -0,0 +1,134 @@ +--- +title: Max Flow +tags: + - Graph + - Max Flow + - Maximum Flow + - Ford Fulkerson +--- + +## Flow Network + +A flow network is a special type of directed graph that contains a single source and a single target node. In a flow network, each edge has a capacity, which indicates the maximum amount of flow that can pass through that edge. + +
+![One of the earliest examples of a flow network in history.](img/first-flow.png) +
One of the earliest examples of a flow network in history.
+
+ +## Maximum Flow + +Maximum flow is the problem of finding the maximum amount of flow that can be sent from the source to the target in a flow network. + +There are several algorithms to solve the Maximum Flow problem. The time complexities of some popular ones are: + +- Ford-Fulkerson algorithm: $\mathcal{O}(E * \text{flowCount})$ +- Edmonds-Karp algorithm: $\mathcal{O}(V * E^2)$ +- Dinic's algorithm: $\mathcal{O}(E * V^2)$ + +where $V$ is the number of vertices and $E$ is the number of edges in the flow network. + +## Ford-Fulkerson + +The steps of the Ford-Fulkerson maximum flow algorithm are as follows: + +- Find a path from the source to the target. +- The edge with the minimum capacity on the found path determines the flow that can pass through this path. +- Decrease the capacities of the edges on the path by the flow amount (the minimum capacity found in step 2) and add the reverse edges to the graph with a capacity equal to the flow. +- Repeat until there are no more paths from the source to the target. + +Why does this algorithm work? + +For example, let's assume we find a flow of size x through an edge from u to v. + +Suppose the path we found is $a \rightarrow ... \rightarrow u \rightarrow v \rightarrow ... \rightarrow b$. + +We will add a new edge from v to u with a capacity of x to our graph, but this newly added reverse edge does not exist in the original graph. + +After adding the reverse edges, the new path we find might look like $c \rightarrow ... \rightarrow v \rightarrow u \rightarrow ... \rightarrow d$, with a flow of size y. + +It is clear that $y \leq x$, since the reverse edge has capacity x. + +We can represent three different valid flows as follows: + +- A flow of size y following the path $a \rightarrow ... \rightarrow u \rightarrow ... \rightarrow d$ +- A flow of size y following the path $c \rightarrow ... \rightarrow v \rightarrow ... \rightarrow b$ +- A flow of size x - y following the path $a \rightarrow ... 
\rightarrow u \rightarrow v \rightarrow ... \rightarrow b$ + +The overall time complexity of the Ford-Fulkerson algorithm is $\mathcal{O}(E * \text{flowCount})$ because, in the worst case, each found path increases the flow by only 1. Since finding each path takes time proportional to the number of edges, the complexity becomes $\mathcal{O}(E * \text{flowCount})$. + +However, if we implement the Ford-Fulkerson algorithm using BFS, the complexity changes: the length of the shortest augmenting path never decreases, and each edge can act as the bottleneck at most $\mathcal{O}(V)$ times, leading to a time complexity of $\mathcal{O}(V * E^2)$. This specific implementation is known as the Edmonds-Karp Algorithm. + +<figure markdown>
+![The figure on the left shows how much flow is passing through each edge. The figure on the right represents the current state of the graph.](img/flow1.png) +
The figure on the left shows how much flow is passing through each edge. The figure on the right represents the current state of the graph.
+
+ +
+![Flow = 7](img/flow2.png) +
Flow = 7
+
+ +
+![Flow = 8](img/flow3.png) +
Flow = 8
+
+ +
+![Flow = 13](img/flow4.png) +
Flow = 13
+
+ +
+![Flow = 15](img/flow5.png) +
Flow = 15
+
+ +```cpp +// c matrix holds the capacities of the edges. +// g adjacency list allows us to traverse the graph. +// n, source, sink, the parent array and flow are globals. +bool bfs() { + vector<bool> visited(n, false); + queue<int> q; + q.push(source); + visited[source] = true; + while (!q.empty()) { + int node = q.front(); + q.pop(); + if (node == sink) + break; + for (int i = 0; i < g[node].size(); i++) { + int child = g[node][i]; + if (c[node][child] <= 0 or visited[child]) + continue; + visited[child] = true; + parent[child] = node; + q.push(child); + } + } + return visited[sink]; +} +int max_flow() { + while (bfs()) { + int curFlow = -1, node = sink; + while (node != source) { + // curFlow is the minimum capacity in the current path, i.e. the flow we found. + int len = c[parent[node]][node]; + if (curFlow == -1) + curFlow = len; + else + curFlow = min(curFlow, len); + node = parent[node]; + } + flow += curFlow; + node = sink; + while (node != source) { + c[parent[node]][node] -= curFlow; + // We are subtracting the flow we found from the path we found. + c[node][parent[node]] += curFlow; // We are adding the reverses of the edges + node = parent[node]; + } + } + return flow; +} +``` diff --git a/docs/graph/minimum-spanning-tree.md b/docs/graph/minimum-spanning-tree.md new file mode 100644 index 0000000..5596d18 --- /dev/null +++ b/docs/graph/minimum-spanning-tree.md @@ -0,0 +1,104 @@ +--- +title: Minimum Spanning Tree +tags: + - Graph + - Minimum Spanning Tree + - Prim + - Kruskal +--- + +## Definition + +Given an undirected weighted connected graph $G = (V,E)$, a spanning tree of $G$ is a connected acyclic subgraph that covers all nodes and some of the edges. In a disconnected graph (where there is more than one connected component), the spanning tree of that graph is defined as the forest of the spanning trees of each connected component of the graph. + +A minimum spanning tree (MST) is a spanning tree in which the sum of edge weights is minimum.
The MST of a graph is not unique in general: there might be more than one spanning tree with the same minimum cost. For example, take a graph where all edges have the same weight; then any spanning tree is a minimum spanning tree. In problems involving minimum spanning trees where you have to output the tree itself (and not just the minimum cost), the problem either adds constraints so that the answer is unique, or simply asks for any minimum spanning tree. + +<figure markdown>
+![Minimum Spanning Tree](img/mst.png) +
MST of the graph. It spans all nodes of the graph and it is connected.
+
+ +To find the minimum spanning tree of a graph, we will introduce two algorithms. The first one is Prim's algorithm, which is similar to Dijkstra's algorithm. The other is Kruskal's algorithm, which makes use of the disjoint set data structure. Let's discover each one of them in detail! + +## Prim Algorithm + +Prim's algorithm is very similar to Dijkstra's shortest path algorithm. In this algorithm we have a set $S$ which represents the explored nodes, and again we can maintain a priority queue to efficiently find the closest node in $V-S$. It is a greedy algorithm just like Dijkstra's shortest path algorithm. + +<figure markdown>
+``` +G = (V, E) V set of all nodes, E set of all edges +T = {} result, edges of MST +S = {1} explored nodes +while S /= V do + let (u, v) be the lowest cost edge such that u in S and v in V - S; + T = T U {(u, v)} + S = S U {v} +end +``` +
Prim's algorithm in pseudocode. What is the problem here?</figcaption>
+
+ +There is a problem with this implementation: it assumes that the graph is connected. If the graph is not connected, this algorithm will get stuck in a loop. There is a good visualization of Prim's algorithm at [10]. If we use a priority queue, the complexity is $O(E \log V)$. + +<figure markdown>
+![Prim's Algorithm](img/prim.png) +
Example of how Prim's algorithm constructs the MST</figcaption>
+
+ +## Kruskal Algorithm + +In Prim's algorithm we started with a specific node and then proceeded by choosing the closest neighbor node to our current tree. In Kruskal's algorithm, we follow a different strategy: we build our MST by choosing one edge at a time, linking our (initially separated) nodes together until we connect all of the graph. + +To achieve this task, we will start with all the nodes separated, each in its own group. In addition, we will have the list of edges from the original graph sorted based on their cost. At each step, we will: + +1. Pick the smallest available edge (that is not taken yet) +2. Link the nodes it connects together, by merging their groups into one unified group +3. Add the cost of the edge to our answer + +However, you may realize that in some cases the link we add would connect two nodes from the same group (because they were grouped before by other taken edges), violating the spanning tree condition (acyclicity) and, more importantly, introducing unnecessary edges that add more cost to the answer. To solve this problem, we only add an edge when it connects two nodes that currently (at the time of processing this edge) belong to different groups, which completes the algorithm. + +The optimality of Kruskal's algorithm comes from the fact that we are taking edges from a sorted list. For a more rigorous proof please refer to [11]. + +So how can we efficiently merge groups of nodes and check which group each node belongs to? We can utilize the disjoint set data structure, which supports union and find operations in (nearly) constant amortized time. + +```cpp +typedef pair<int, pair<int,int> > edge; +// represent an edge as the triplet (w,u,v) +// w is the weight, u and v are vertices. +// edge.first is the weight; edge.second.first -> u, edge.second.second -> v +typedef vector<edge> weighted_graph; + +/* union-find data structure utilities */ +const int maxN = 3005; +int parent[maxN]; +int ssize[maxN]; +void make_set(int v); +int find_set(int v); +void union_sets(int a, int b); +void init_union_find(); + +/* Code that finds the edges in the MST */ +void kruskal(vector<edge> &edgeList ){ + vector<edge> mst; + init_union_find(); + sort(edgeList.begin(), edgeList.end(), + [](const auto &a, const auto &b) { return a.first < b.first; }); + // lambda comparator: sorts the edge pairs by their first element (the weight). + for( auto e: edgeList){ + if( find_set(e.second.first) != find_set(e.second.second)){ + mst.push_back(e); + union_sets(e.second.first, e.second.second); + } + } +} +``` + +To calculate the time complexity, observe how we first sorted the edges; this takes $\mathcal{O}(E \log E)$. In addition we pass through the edges one by one, and each time we check which groups the two endpoints of the edge belong to, and in some cases merge the two groups. In the worst case both operations (finding and merging) happen, but since the disjoint set data structure guarantees (nearly) constant amortized time for both operations, processing the edges takes $\mathcal{O}(E)$ amortized time. + +So in total we have $\mathcal{O}(E \log E)$ from sorting the edges and $\mathcal{O}(E)$ from processing them, which results in a total of $\mathcal{O}(E \log E)$ (if you don't understand why, please refer to the first bundle where we discuss time complexity). + +<figure markdown>
+![Kruskal's Algorithm](img/kruskal.jpg) +
Example of how Kruskal's algorithm constructs the MST</figcaption>
+
diff --git a/docs/graph/representing-graphs.md b/docs/graph/representing-graphs.md new file mode 100644 index 0000000..17d9b72 --- /dev/null +++ b/docs/graph/representing-graphs.md @@ -0,0 +1,89 @@ +--- +title: Representing Graphs +tags: + - Graph +--- + +## Edge Lists + +An edge list is simply a list of pairs: each entry is an object consisting of the vertex numbers of the 2 endpoints of an edge, together with other attributes like the weight or the direction of the edge. [16] + +- **\+** For some specific algorithms you need to iterate over all the edges (e.g. Kruskal's algorithm). +- **\+** All edges are stored exactly once. +- **\-** It is hard to determine whether two nodes are connected or not. +- **\-** It is hard to get information about the edges of a specific vertex. + +```cpp +#include <iostream> +#include <vector> +using namespace std; + +int main(){ + int edge_number; + vector<pair<int,int> > edges; + cin >> edge_number; + for( int i=0 ; i<edge_number ; i++ ){ + int a, b; + cin >> a >> b; + edges.push_back(make_pair(a,b)); // a struct can be used if edges are weighted or have other properties. + } +} +``` + +## Adjacency Matrices + +Stores the edges in a 2-D matrix; matrix[a][b] keeps information about the edge from a to b. [16] +- **\+** We can easily check whether there is an edge between two vertices. +- **\-** Looping through all edges of a specific node is expensive because you have to check all of the empty cells too. These empty cells also take huge amounts of memory in a graph which has many vertices (for example, when representing a tree). + +```cpp +#include <iostream> +#include <vector> +using namespace std; +int main(){ + int node_number; + vector<vector<int> > Matrix; + cin >> node_number; + for( int i=0 ; i<node_number ; i++ ){ + Matrix.push_back(vector<int> ()); + for( int j=0 ; j<node_number ; j++ ){ + int weight; + cin >> weight; + Matrix[i].push_back(weight); + } + } +} +``` + +## Adjacency List + +Each node has a list consisting of the nodes it is adjacent to. So, there will be no empty cells, and memory usage will be proportional to the number of edges. It is the representation most commonly used in algorithms. [16] + +- **\+** You do not have to use space for empty cells. +- **\+** Easily iterate over all the neighbors of a specific node. +- **\-** If you want to check whether two nodes are connected, in this form you still need to iterate over all the neighbors of one of them. But there are some setups in which you can do this operation in O(log N): for example, if you won't add any edges later, you can sort every adjacency vector and find a neighbor by binary search. + +```cpp +#include <iostream> +#include <vector> +using namespace std; + +int main(){ + int node_number, path_number; + + vector<vector<int> > paths; + // use an object instead of int, + // if you need to store other features + + cin >> node_number >> path_number; + for( int i=0 ; i<node_number ; i++ ) + paths.push_back(vector<int> ()); + for( int j=0 ; j<path_number ; j++ ){ + int beginning_node, end_node; + cin >> beginning_node >> end_node; + + paths[ beginning_node ].push_back( end_node ); // push the end node + // paths[ end_node ].push_back( beginning_node ); + // ^^^ If edges are undirected, you should push in the reverse direction too + } +} +``` \ No newline at end of file diff --git a/docs/graph/shortest-path.md b/docs/graph/shortest-path.md new file mode 100644 index 0000000..a7b7911 --- /dev/null +++ b/docs/graph/shortest-path.md @@ -0,0 +1,68 @@ +--- +title: Shortest Path Problem +tags: + - Graph + - Shortest Path Problem + - Dijkstra +--- + +## Definition + +Let \(G(V,E)\) be a graph, and let \(v_i\) and \(v_j\) be two nodes of \(G\). We say a path between \(v_i\) and \(v_j\) is the shortest path if the sum of the edge weights (cost) in the path is minimum. In other words, the shortest path problem is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized. [5] + +<figure markdown>
+![Shortest Path](img/shortest.png) +
Example of a shortest path in a graph. The source is A and the target is F. Image taken from [5].</figcaption>
+
+ +We will cover several shortest path algorithms in this bundle. One of them is Dijkstra's Shortest Path Algorithm, but it has a drawback: edge weights should be non-negative for the optimality of the algorithm. We will discover other algorithms in which this condition isn't necessary, like the Floyd-Warshall and Bellman-Ford algorithms. + +## Dijkstra's Shortest Path Algorithm + +Dijkstra's Shortest Path algorithm is straightforward. In brief, we have a set \(S\) that contains the explored nodes and an array \(d\) that contains the shortest path costs from the source to the other nodes. In other words, \(d(u)\) represents the shortest path cost from the source to node \(u\). The procedure is as follows. First, add the source node to the set \(S\), which represents the explored nodes, and assign the source a minimum cost of zero. Then, in each iteration, we add the unexplored node with the lowest cost \(d(u)\) to \(S\). Let \(S' = V - S\) denote the unexplored nodes. We calculate \(d(x)\) for each node \(x\) in \(S'\), then pick the minimum cost node and add it to \(S\). So how do we calculate \(d(x)\)? For any node \(x\) in \(S'\), \(d(x) = \min(d(u) + e)\) over all edges of cost \(e\) going from some node \(u\) in \(S\) to \(x\). It is a greedy algorithm. + +Here is the explanation of the algorithm step by step: + +1. Initialize an empty set and a distance array, and insert the source into the set. + +2. Initialize a min-heap and put the source into the heap with key zero. + +3. While the heap is not empty, take the top element from the heap and add its neighbours to the min-heap. + +4. Once we pick an element from the heap, it is guaranteed that the same node will never be added to the heap with a lower key value. + +In the implementation we can use the priority queue data structure in order to increase efficiency. If we put the unexplored nodes into a min-priority queue keyed by distance, we can take the lowest cost unexplored node in \(O(\log n)\) time, which is efficient.
+```cpp +typedef pair<int, int> edge; +typedef vector<edge> adjList; +typedef vector<adjList> graph; + +void dijkstra(graph &g, int s) { + vector<int> dist(g.size(), INT_MAX/2); + vector<bool> visited(g.size(), false); + + dist[s] = 0; + + priority_queue<edge, vector<edge>, greater<edge> > q; + q.push({0, s}); + + while(!q.empty()) { + int v = q.top().second; + int d = q.top().first; + q.pop(); + + if(visited[v]) continue; + visited[v] = true; + + for(auto it: g[v]) { + int u = it.first; + int w = it.second; + if(dist[v] + w < dist[u]) { + dist[u] = dist[v] + w; + q.push({dist[u], u}); + } + } + } +} +``` \ No newline at end of file diff --git a/docs/graph/strong-connectivity-and-biconnectivity.md b/docs/graph/strong-connectivity-and-biconnectivity.md new file mode 100644 index 0000000..6d7d765 --- /dev/null +++ b/docs/graph/strong-connectivity-and-biconnectivity.md @@ -0,0 +1,24 @@ +--- +title: Strong Connectivity and Biconnectivity +tags: + - Strong Connectivity + - Biconnectivity + - Graph +--- + +## Strong Connectivity + +To reach a target node from a given node, it must be possible to arrive at the target by passing through a finite number of nodes. + +In an undirected graph, if every node is reachable from every other node, the graph is called **connected**. When the same concept is applied to directed graphs, it is called **strongly connected**. + +In other words, for a directed graph to be **strongly connected**, it must be possible to reach every other node from any given node. + +## Biconnectivity + +In an undirected graph, if the remaining graph remains connected when any node is removed, the graph is called **biconnected**. In other words, if the graph has no **articulation points**, it is considered a **biconnected** graph. + +<figure markdown>
+![An example of biconnected graph](img/biconnectivity.png) +
An example of a biconnected graph</figcaption>
+
diff --git a/docs/graph/strongly-connected-components.md b/docs/graph/strongly-connected-components.md new file mode 100644 index 0000000..0701f1a --- /dev/null +++ b/docs/graph/strongly-connected-components.md @@ -0,0 +1,65 @@ +--- +title: Strongly Connected Components +tags: + - Strongly Connected Components + - Graph +--- + +All directed graphs can be divided into disjoint subgraphs that are strongly connected. For two subgraphs to be disjoint, they must not share any common edges or nodes. If we consider each of these resulting subgraphs as a single node and create a new graph, the resulting graph will be a directed acyclic graph (DAG), meaning it will have no cycles. + +
+![The subgraphs marked in red are the strongly connected components.](img/scc.png) +
The subgraphs marked in red are the strongly connected components.
+
+ +
+![The newly formed graph, created by treating each strongly connected component as a single node, results in a directed acyclic graph (DAG), meaning it contains no cycles.](img/scc-graph.png) +
The newly formed graph, created by treating each strongly connected component as a single node, results in a directed acyclic graph (DAG), meaning it contains no cycles.
+
+ +Tarjan's Algorithm for finding strongly connected components (SCCs) in a directed graph (An alternative approach is Kosaraju's Algorithm, but Tarjan's algorithm is often preferred in practice due to its speed and simpler understanding): + +- Start traversing the graph using DFS order from any node and push the visited nodes onto a stack. Calculate the discovery time for each node. (Discovery time is the time unit when the node is first reached during DFS traversal, and we will call this the index.) +- If a node is in the stack, it is not yet part of any strongly connected component. This is because, when a strongly connected component is found, all the nodes belonging to that component are removed from the stack. +- For each node, calculate the index of the node that has the minimum index among the nodes reachable from the current node and its subtree through edges that do not belong to any strongly connected component. This value is called the "minimum reachable depth" from the subtree of the node (also known as the "low" value). +- If a node's low value is equal to its own index, then this node and all nodes below it in the stack form a strongly connected component. This is because if we call this node "u," there must be an edge from u's subtree back to u itself. Otherwise, u's low value would be clearly smaller than its own index. +- When a strongly connected component is found (as explained in the previous step), remove all nodes belonging to this component from the stack. + +Using Tarjan's Algorithm, we can find all strongly connected components in a graph with a time complexity of $\mathcal{O}(V + E)$, where $V$ is the number of vertices and $E$ is the number of edges. + +```cpp +void dfs(int node) { + low[node] = index[node] = ++curTime; + // curTime holds the discovery time of each node. + used[node] = true; + + st.push(node); + inStack[node] = true; + // inStack holds whether a node is in the stack or not. 
+ for (auto it : g[node]) { + if (!used[it]) { + dfs(it); + low[node] = min(low[node], low[it]); + } else if (inStack[it]) + low[node] = min(low[node], index[it]); + // If the adjacent node is in the stack, then this edge can be a back edge. + } + if (low[node] == index[node]) { + while (1) { + int x = st.top(); + st.pop(); + cout << x << " "; + inStack[x] = false; + if (x == node) + break; + } + cout << endl; + } +} + +void scc() { + for (int i = 0; i < n; i++) + if (!used[i]) + dfs(i); +} +``` diff --git a/docs/graph/topological-sort.md b/docs/graph/topological-sort.md new file mode 100644 index 0000000..6a1e753 --- /dev/null +++ b/docs/graph/topological-sort.md @@ -0,0 +1,72 @@ +--- +title: Topological Sort +tags: + - Graph + - Topological Sort +--- + +## Definition + + +Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every directed edge u->v, vertex u comes before v in the ordering. Topological Sorting for a graph is not possible if the graph is not a DAG [6]. + +There are many important usages of topological sorting in computer science; applications of this type arise in instruction scheduling, ordering of formula cell evaluation when recomputing formula values in spreadsheets, logic synthesis, determining the order of compilation tasks to perform in makefiles, data serialization, and resolving symbol dependencies in linkers. It is also used to decide in which order to load tables with foreign keys in databases [7]. + +There are known algorithms (e.g Kahn’s algorithm) to find topological order in linear time. Below, you can find one of the implementations: + +
+![Topological Order](img/toporder.png) +
For example, a topological sorting of this graph is “5 4 2 3 1 0”. There can be more than one topological sorting for a graph; for example, another topological sorting of this graph is “4 5 2 3 1 0”. The first vertex in a topological sorting is always a vertex with in-degree 0 (a vertex with no incoming edges) [6].</figcaption>
+
+ +## Algorithm + +```cpp +typedef vector<int> adjList; +typedef vector<adjList> graph; +typedef pair<int,int> ii; + +void kahn(graph &g) { + vector<int> result; + queue<int> q; + vector<int> degree(g.size(), 0); // number of incoming edges. + for(auto &list: g){ + for(auto &node:list) { + degree[node]++; + } + } + + for(int i=0; i < g.size(); ++i) { + if (degree[i] == 0) + q.push(i); + } + + while( !q.empty()) { + int node = q.front(); + result.push_back(node); + q.pop(); + + for (auto &ng: g[node]) { + degree[ng]--; + if (degree[ng] == 0) + q.push(ng); + } + } + + for(auto &i:result) + cout << i << " "; + cout << endl; +} +int main(){ + graph g(6); + g[1].push_back(0); + g[1].push_back(2); + g[2].push_back(3); + g[3].push_back(4); + g[4].push_back(5); + kahn(g); + return 0; +} +``` + +As for time complexity: we traverse all edges in the beginning (calculating degrees), and in the while loop we remove each edge once and traverse all nodes. Hence, the time complexity of this algorithm is \(O(V + E)\). Note that this implementation assumes the graph is a DAG. Try improving this code to support checking whether the graph is a DAG! diff --git a/docs/graph/tree-traversals.md b/docs/graph/tree-traversals.md new file mode 100644 index 0000000..18570f1 --- /dev/null +++ b/docs/graph/tree-traversals.md @@ -0,0 +1,80 @@ +--- +title: Tree Traversals +tags: + - Tree + - Preorder + - Postorder + - Inorder +--- + +Tree traversal is the process of visiting every node in a tree structure exactly once for some purpose (such as reading or updating information). For binary trees there are several well-defined traversal orders; they are specific to binary trees, but they may be generalized to other trees and even to graphs as well. + +<figure markdown>
+![a binary tree](img/binary-tree.png) +
a binary tree
+
+ +## Preorder Traversal + +Preorder means that a root will be evaluated before its children. In other words the order of evaluation is: Root-Left-Right + +``` +Preorder Traversal + Look Data + Traverse the left node + Traverse the right node +``` + +Example: 50 – 7 – 3 – 2 – 8 – 16 – 5 – 12 – 17 – 54 – 9 – 13 + +## Inorder Traversal +Inorder means that the left child (and all of the left child’s children) will be evaluated before the root and before the right child and its children. Left-Root-Right (by the way, in binary search tree inorder retrieves data in sorted order) + +``` +Inorder Traversal + Traverse the left node + Look Data + Traverse the right node +``` + +Example: 2 – 3 – 7 – 16 – 8 – 50 – 12 – 54 – 17 – 5 – 9 – 13 + +## Postorder Traversal +Postorder is the opposite of preorder, all children are evaluated before their root: Left-Right-Root + +``` +Postorder Traversal + Traverse the left node + Traverse the right node + Look Data +``` + +Example: 2 – 3 – 16 – 8 – 7 – 54 – 17 – 12 – 13 – 9 – 5 – 50 + +## Implementation + +```py +class Node: + def __init__(self,key): + self.left = None + self.right = None + self.val = key + +def printInorder(root): + if root: + printInorder(root.left) + print(root.val) + printInorder(root.right) + +def printPostorder(root): + if root: + printPostorder(root.left) + printPostorder(root.right) + print(root.val) + +def printPreorder(root): + if root: + print(root.val) + printPreorder(root.left) + printPreorder(root.right) +``` diff --git a/docs/graph/union-find.md b/docs/graph/union-find.md new file mode 100644 index 0000000..110aafa --- /dev/null +++ b/docs/graph/union-find.md @@ -0,0 +1,48 @@ +--- +title: Union Find +tags: + - Graph + - Union Find + - Disjoint Set Union + - DSU +--- + +A disjoint-set data structure is a data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. 
A union-find algorithm is an algorithm that performs two useful operations on such a data structure: [11, 12] + +- Find: Determine which subset a particular element is in. This can be used for determining whether two elements are in the same subset. +- Union: Join two subsets into a single subset. +- The union-find algorithm can be used to check whether an undirected graph contains a cycle or not. This method assumes that the graph doesn't contain any self-loops. +- It is most commonly used in Kruskal's minimum spanning tree algorithm, where it checks whether two nodes are in the same connected component or not. [10] + +## Implementation + +```cpp +#include <bits/stdc++.h> +using namespace std; + +const int MaxN = 100005; // Max number of nodes + +int ancestor[MaxN]; + +int parent(int k) // return the ancestor +{ + if(ancestor[k]==k) return k; + return ancestor[k] = parent(ancestor[k]); + // do not forget to update ancestor[k]; this path compression decreases the time complexity of the next operations +} + +void MakeUnion(int a, int b) // setting parent of root(a) as root(b). +{ + a = parent(a); + b = parent(b); + ancestor[a] = b; +} +int find(int a, int b) +{ + return parent(a)==parent(b); +} +``` + +## Complexity + +Using both path compression (or splitting, or halving) and union by rank or size ensures that the amortized time per operation is only $\mathcal{O}(\alpha (n))$, which is optimal, where $\alpha (n)$ is the inverse Ackermann function. This function has a value $\alpha (n)<5$ for any value of n that can be written in this physical universe, so the disjoint-set operations take place in essentially constant time.
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..a63b734
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,23 @@
+---
+title: Algorithm Program
+---
+
+The Algorithm Program contains lectures about algorithms and data structures prepared by the inzva community, aimed at teaching advanced knowledge of algorithms to university students, spreading algorithmic thinking and providing training which will help them in international contests as well as in their professional lives.
+
+There is also a video playlist in Turkish about some of the algorithms and data structures on YouTube:
+
+
+
+## How to Use This Site
+
+- Lectures can be found by topic in the navigation bar, and the subtopics can be found on those pages. A search bar is also available for finding pages by term.
+- In each lecture, related problems and training sets from [algoleague.com](https://algoleague.com) are mentioned. Practicing on those is highly recommended.
+
+## How to Contribute
+
+In order to contribute (adding a new lecture, fixing any type of error), the steps below should be followed:
+
+1. Create an issue and briefly explain the purpose of your contribution.
+2. Fork the repository with your personal account and apply your changes.
+3. Create a pull request to the master branch and add the link of the pull request to the issue.
+4. After review and discussion, your pull request will be merged. Thank you for your contribution!
diff --git a/docs/introduction/index.md b/docs/introduction/index.md
new file mode 100644
index 0000000..64ed11d
--- /dev/null
+++ b/docs/introduction/index.md
@@ -0,0 +1,525 @@
+---
+title: Introduction
+---
+
+**Editor:** Muhammed Burak Buğrul
+
+**Reviewers:** Kadir Emre Oto & Yusuf Hakan Kalaycı
+
+## Introduction
+
+First of all, this is an intensive algorithm program prepared by inzva, which includes lectures, contests, problem-solving sessions and a variety of practices. 
The term "competitive programming" will be mentioned frequently in this program, especially in connection with its community and with making progress in algorithms and data structures.
+
+As a quick warm-up, we will have a look at what happens when you compile and run a program, at basic data types, and at functions. After that, we will examine the C++ Standard Template Library (STL), time complexity and memory usage.
+
+## Command Line
+
+A lot of people don't use the command line if they can use an alternative. There are powerful IDEs (Integrated Development Environments), and they really do make programming easier in some respects. However, knowing how to use the command line is important, especially for competitive programming. Firstly, it gives you low-level knowledge and full control; secondly, every computer and environment has a command line interface.
+
+In this document, you will find only a basic introduction to the command line, which is no more than the basic usage of a file system, compiler, and programs.
+
+There are a lot of differences between the command lines of Windows and Linux, while the differences between those of Mac and Linux are smaller.
+
+### Linux and Mac
+
+Mac users can use the built-in Terminal. You can find it by searching from Spotlight. Linux users can use gnome-terminal or any other installed terminal. Again, you can find them by using the built-in search tab.
+
+Some basic commands:
+
+- `ls` lists the files in the current directory. Usage: `ls`
+- `cd` changes the current directory. Usage: `cd ~/Desktop`
+- `mkdir` makes a new directory. Usage: `mkdir directory_name`
+- `mv` moves (cuts) a file. Usage: `mv source_path destination_path`
+- `cp` copies a file. Usage: `cp source_path destination_path`
+- `rm` removes a file. Usage: `rm file_path`
+
+You can read more about the Unix command line at: [http://linuxcommand.org](http://linuxcommand.org)
+
+## Compiling and Executing Programs
+
+### G++
+
+G++ comes installed in most Linux environments, but on a Mac you should install Xcode first. 
+
+You can compile your cpp source file by typing `g++ source.cpp`. The default output of this command is `a.out`.
+
+### Running Executable Files
+
+For Linux and Mac, the command to run a program is `./program_name`. If you use the default `g++` command, the name of your program will be `a.out`, so you should type `./a.out` in order to run it.
+
+### Closing a Program
+
+When you want to kill a running program, you can simply hit *Control + C*.
+
+When you want to suspend a running program, you can simply hit *Control + Z*.
+
+When you want to register EOF on standard input, you can simply hit *Control + D*.
+
+### Input/Output Redirection
+
+You can redirect the input and output streams of a program by using the command line, and it is surprisingly easy.
+
+### Saving Output to a File
+
+The only thing you need is the `>` symbol. Just add it at the end of your run command, followed by the output file name: `./a.out > output.txt`
+
+**Note:** This redirection creates `output.txt` if it doesn't exist; otherwise it deletes all content in it, then writes into it. If you want to use `>` in appending mode, you should use `>>` instead.
+
+### Reading Input from a File
+
+It is almost the same as output file redirection. The symbol is `<` now. Usage: `./a.out < input.txt`
+
+**This is much easier than copying and pasting input to test your program, especially in contests.**
+
+### Using Both at the Same Time
+
+One of the wonderful things about these redirections is that they can be used at the same time. You can simply add both to the end of your run command: `./a.out < input.txt > output.txt`
+
+### pipe
+
+Sometimes, you may want to redirect the output of a program to another program as input. You can use the `|` symbol for this. Usage: `./program1 | ./program2`
+
+### diff
+
+As the name denotes, it checks two files line by line to see whether they are the same or not. If not, it outputs the differing lines. 
Usage: `diff file1.txt file2.txt`
+
+It is very useful for comparing the output of a brute-force solution with the output of the real solution.
+
+## Structs and Classes
+
+In almost every programming language, you can define your own data type. C++ has structs and classes; Python has dictionaries, classes etc. At the simplest level, you can think of them as packets that store several pieces of data and implement functions. They have a lot more abilities than these two (you can check out OOP).
+
+Let us examine a fraction struct written in C++.
+
+We need to store two values for a fraction: the numerator and the denominator.
+
+```c++ linenums="1"
+struct Fraction {
+    int numerator, denominator;
+};
+```
+
+This is the simplest definition of a struct. The Fraction struct contains two `int` variables. We call them members. So, the Fraction struct has two members called numerator and denominator.
+
+```c++ linenums="1"
+#include <cstdio>
+
+struct Fraction {
+    int numerator, denominator;
+};
+
+Fraction bigFraction(Fraction a, Fraction b) {
+
+    if( a.numerator * b.denominator > a.denominator * b.numerator )
+        return a;
+
+    return b;
+}
+
+int main() {
+    // Create two Fractions in order to compare them
+    Fraction a, b;
+
+    a.numerator = 15;
+    a.denominator = 20;
+
+    b.numerator = 12;
+    b.denominator = 18;
+
+    // Create a new Fraction in order to store the biggest of Fraction a and Fraction b. 
+    Fraction biggest = bigFraction(a, b);
+
+    printf("The biggest fraction is %d / %d\n", biggest.numerator, biggest.denominator);
+    return 0;
+}
+```
+
+Let us do the same in Python 3:
+
+```py linenums="1"
+class Fraction:
+
+    def __init__(self, numerator, denominator):
+        self.numerator, self.denominator = numerator, denominator
+
+def bigFraction(a, b):
+
+    if a.numerator * b.denominator > a.denominator * b.numerator:
+        return a
+
+    return b
+
+a, b = Fraction(15, 20), Fraction(12, 18) # Create two Fractions in order to compare them
+biggest = bigFraction(a, b)
+
+print(biggest.numerator, biggest.denominator)
+```
+
+In the sample codes above, `a`, `b` and `biggest` are called **objects** of `Fraction`. The word **instance** can also be used instead of object.
+
+### The Arrow Operator (C++)
+Struct usage can change slightly in C++. When you have a pointer to a struct, you should use `->` to access its members instead of the `.` operator. If you still want to use the `.` operator, you should do it this way: `(*ptr).member`. But the arrow operator is simpler: `ptr->member`.
+
+## Big O Notation
+
+When dealing with algorithms or coming up with a solution, we need to calculate how fast our algorithm or solution is. We can measure this in terms of the number of operations. Big $\mathcal{O}$ notation comes in exactly at this point: it gives an upper limit on this number of operations. The formal definition of Big $\mathcal{O}$ is as follows [1].
+
+Let $f$ be a real or complex valued function and $g$ a real valued function, both defined on some unbounded subset of the real positive numbers, such that $g(x)$ is strictly positive for all large enough values of $x$. One writes:
+
+$$f(x) = \mathcal{O}(g(x)) \quad \text{as } x \rightarrow \infty$$
+
+if and only if for all sufficiently large values of $x$, the absolute value of $f(x)$ is at most a positive constant multiple of $g(x)$. 
That is, $f(x) = \mathcal{O}(g(x))$ if and only if there exists a positive real number $M$ and a real number $x_0$ such that:
+
+$$|f(x)| \leq M g(x) \quad \text{for all } x \text{ such that } x_0 \leq x$$
+
+In many contexts, the assumption that we are interested in the growth rate as the variable $x$ goes to infinity is left unstated, and one writes more simply that:
+
+$$f(x) = \mathcal{O}(g(x))$$
+
+In almost every case in competitive programming, a basic understanding of Big $\mathcal{O}$ notation is enough to decide whether to implement a solution or not.
+
+**Note:** Big $\mathcal{O}$ notation can be used for calculating both the running time and the memory space used.
+
+## Recursion
+
+*Recursion* occurs when a function calls itself in order to solve a problem by first handling smaller instances of that problem. There are thousands of examples in mathematics. One of the simple ones is the *factorial* of $n$. It is denoted by $n!$ and gives the product of all positive integers from $1$ to $n$; for example, $4! = 1\cdot 2\cdot 3\cdot 4 = 24$. If we write the factorial in a mathematical way, it will be:
+
+$$
+\begin{align*}
+    f(n) &= \begin{cases}
+        1 & \text{if $n = 0$\,\, } \\
+        n \cdot f(n - 1) & \text{if $n > 0$\,\,}
+    \end{cases}
+\end{align*}
+$$
+
+The reason why we didn't simply write it as $f(n) = n \cdot f(n-1)$ is that this doesn't give sufficient information about the function. We should know where to end the function calls; otherwise it would call itself infinitely. The ending condition is $n = 0$ here. We call it the *base case*. Every recursive function needs at least one base case.
+
+So if we write out every step of $f(4)$, it will be:
+
+$$
+\begin{align*}
+    4! 
+    &= 4\cdot f(3) && \text{recursive step} \\
+    &= 4\cdot 3\cdot f(2) && \text{recursive step} \\
+    &= 4\cdot 3\cdot 2\cdot f(1) && \text{recursive step} \\
+    &= 4\cdot 3\cdot 2\cdot 1\cdot f(0) && \text{recursive step} \\
+    &= 4\cdot 3\cdot 2\cdot 1\cdot 1 && \text{base case} \\
+    &= 24 && \text{arithmetic}
+\end{align*}
+$$
+
+We can implement this in a programming language; here is an iterative version first:
+
+```c++ linenums="1"
+int factorial(int n) {
+    int result = 1;
+    for(int i = 1; i <= n; i++)
+        result *= i;
+    return result;
+}
+```
+
+We say a function is recursive if it calls itself. Let us change this iterative factorial function into a recursive one. When you imagine what the recursive code will look like, you will notice that it resembles the mathematical definition:
+
+```c++ linenums="1"
+int factorial(int n) {
+    if(n == 0)
+        return 1;
+    return n * factorial(n - 1);
+}
+```
+
+Note that we didn't forget to put our base case into the recursive implementation.
+
+### Time Complexity
+
+In the case above, it can be seen that both the recursive and the iterative implementations of the factorial function run in $\mathcal{O}{(n)}$ time. But this is not always the case. Let us examine the Fibonacci function; it is mathematically defined as:
+
+$$
+\begin{align*}
+    f(n) &= \begin{cases}
+        1 & \text{if $n = 0$ or $n = 1$\,\, } \\
+        f(n - 1) + f(n - 2) & \text{if $n > 1$\,\,}
+    \end{cases}
+\end{align*}
+$$
+
+We can implement this function with just one for loop:
+
+```c++ linenums="1"
+int fibonacci(int n) {
+    int result = 1, previous = 1;
+    for (int i = 2; i <= n; i++) {
+        int tmp = result;
+        result += previous;
+        previous = tmp;
+    }
+    return result;
+}
+```
+
+Again, we can implement the recursive one according to the mathematical formula:
+
+```c++ linenums="1"
+int fibonacci(int n) {
+    if( n == 0 || n == 1 )
+        return 1;
+    return fibonacci(n - 1) + fibonacci(n - 2);
+}
+```
+
+Let us calculate the time complexity of the iterative one. 
There are three basic operations inside a for loop that repeats $n-2$ times, so the time complexity is $\mathcal{O}(n)$. But what about the recursive one? Let us examine its recursion tree (the diagram of function calls) for $n = 5$ on [visualgo](https://visualgo.net/en/recursion).
+
+The function $f$ is called more than once for some values of $n$. Actually, at every level the number of function calls doubles, so the time complexity of the recursive implementation is $\mathcal{O}{(2^n)}$. It is far worse than the iterative one. The recursive one can be optimized with techniques like memoization, but that is another topic to learn in the following weeks.
+
+### Mutual Recursion
+
+Mutual recursion occurs when functions call each other. For example, function `f` calls another function `g`, which also somehow calls `f` again.
+
+**Note:** When using mutual recursion in C++, don't forget to declare one of the functions beforehand so that the other function can know the first one from its prototype.
+
+**Note 2:** You can chain more than two functions and it will still be mutual recursion.
+
+### Enumeration and Brute-Force
+
+**Enumeration** is a method of listing all the elements of a set one by one.
+
+For example, generating permutations is one of the enumeration techniques. The first permutation of the numbers between $1$ and $n$ is:
+
+$$1, 2, 3... n-1, n$$
+
+And the second one is:
+
+$$1, 2, 3... n, n-1$$
+
+Finally, the last one is:
+
+$$n, n-1... 3, 2, 1$$
+
+Additionally, we can try to enumerate all possible distributions of $n$ elements into 3 different sets. An example of a distribution of 5 elements can be represented as:
+
+$$1, 1, 2, 1, 3$$
+
+In this distribution, the first, the second and the fourth elements go into the first set; the third element goes into the second set; and the last element goes into the third set.
+
+Enumerations can be done easily with recursive functions. We will provide example implementations of the 3-set one. 
But before examining the recursive implementation, let us try to implement an iterative one:
+
+```c++ linenums="1"
+#include <cstdio>
+
+int main(){
+
+    for( int i=1 ; i<=3 ; i++ )
+        for( int j=1 ; j<=3 ; j++ )
+            for( int k=1 ; k<=3 ; k++ )
+                for( int l=1 ; l<=3 ; l++ )
+                    for( int m=1 ; m<=3 ; m++ )
+                        printf("%d %d %d %d %d\n", i, j, k, l, m);
+
+    return 0;
+}
+```
+
+It will print all possible distributions of 5 elements into 3 sets. But what if we had 6 elements? Yes, we would have to add another for loop. What if we had $n$ elements? We cannot add an infinite number of for loops. But we can easily apply the same logic with recursive functions:
+
+```c++ linenums="1"
+#include <cstdio>
+
+int ar[100];
+
+void enumerate( int element, int n ){
+
+    if( element > n ){ // Base case
+
+        for( int i=1 ; i<=n ; i++ )
+            printf("%d ", ar[i]);
+
+        printf("\n");
+        return;
+    }
+
+    for( int i=1 ; i<=3 ; i++ ){
+        ar[element] = i;
+        enumerate(element + 1, n);
+    }
+}
+
+int main(){
+    enumerate(1, 5);
+    return 0;
+}
+```
+
+**Brute-force** means trying all cases in order to achieve something (finding the best, the shortest, the cheapest etc.).
+
+One of the simplest examples of brute-force approaches is primality checking. We know that for a prime $P$ there is no positive integer in the range $[2, P-1]$ that evenly divides $P$. 
We can simply check all integers in this range to decide if it is prime:
+
+```c++ linenums="1"
+bool isPrime(int N) {
+    for( int i=2 ; i<N ; i++ )
+        if( N % i == 0 )
+            return false;
+    return true;
+}
+```
+
+## C++ Standard Template Library (STL)
+
+### Pairs
+
+**C++:** A pair keeps two values, possibly of different types, together:
+
+```c++ linenums="1"
+#include <iostream>
+
+using namespace std;
+
+int main(){
+
+    pair<int, int> p(1, 2);
+    pair<int, string> p2;
+
+    p2.first = 3;
+    p2.second = "Hey there!";
+
+    pair<pair<int, string>, string> nested;
+
+    nested.first = p2;
+    nested.second = "This is a nested one";
+
+    cout << "Info of p -> " << p.first << " " << p.second << endl;
+    cout << "Info of p2 -> " << p2.first << " " << p2.second << endl;
+    cout << "Info of nested -> " << nested.first.first << " " << nested.first.second
+         << " " << nested.second << endl;
+
+    return 0;
+}
+```
+
+**Python:** You can simply create a tuple or a list:
+
+```py linenums="1"
+p = (1, 2)
+p2 = [1, "Hey there!"]
+nested = ((3, "inner?"), "outer", "this is a tuple you can add more")
+
+p2[0] = 3
+# nested[0] = "don't" # In Python you can't change tuples, but you can change lists
+
+print(p, p2, nested)
+```
+
+### Vectors
+
+**C++:** When using an array, we have to decide its size in advance. What if we didn't have to do this, what if we could add elements to it without considering the current size? Well, all these ideas take us to vectors.
+
+C++ has this structure, and its name is vector. It is a dynamic array, but you don't have to think about its size; you can simply add elements to it. Like pairs, you can use it with any type (int, double, another vector, your struct/class etc.). Usage of a vector is very similar to a classic array:
+
+```c++ linenums="1"
+#include <iostream>
+#include <vector>
+
+using namespace std;
+
+int main(){
+
+    vector<int> ar;
+
+    for( int i=0 ; i<10 ; i++ )
+        ar.push_back(i);
+
+    for( int i=0 ; i<(int)ar.size() ; i++ )
+        cout << ar[i] << " ";
+
+    cout << endl;
+    return 0;
+}
+```
+
+**Python:** Python lists already behave like vectors:
+
+```py linenums="1"
+ar = []
+
+for i in range(10):
+    ar.append(i)
+
+print(ar)
+```
+
+### Stacks, Queues, and Deques
+
+**C++:** They are no different than the stack, queue and deque we already know. 
The STL provides the implementations; you can simply include the libraries and use them. See [queue](http://www.cplusplus.com/reference/queue/queue/), [stack](http://www.cplusplus.com/reference/stack/stack/), [deque](http://www.cplusplus.com/reference/deque/deque/).
+
+### Priority Queues
+It is basically a built-in heap structure. You can add an element or remove the top item in $\mathcal{O}(\log N)$ time, and read the top item in $\mathcal{O}(1)$ time. The top item is decided according to your choice of priority. This priority can be the magnitude of the value, the insertion time etc.
+
+**C++:** The one difference for priority queues is that you should add `#include <queue>`; there is no separate priority queue header. You can find samples [here](http://www.cplusplus.com/reference/queue/priority_queue/). Again, you can define a priority queue with any type you want.
+
+**Note:** The default `priority_queue` prioritizes elements by highest value first. [Here](https://en.cppreference.com/w/cpp/container/priority_queue) are three ways of defining the priority.
+
+**Python:** You can use [heapq](https://docs.python.org/2/library/heapq.html) in Python.
+
+### Sets and Maps
+
+**C++:** Now that we have mentioned binary trees (the heap above), we can continue with the built-in self-balancing binary trees. Sets are key collections, and maps are key-value collections. Sets are useful when you want to add/remove elements in $\mathcal{O}(\log N)$ time and also check the existence of an item (key) in $\mathcal{O}(\log N)$ time. Maps basically do the same, but you can change the value associated with a key without changing the position of the key in the tree. You can check the C++ references for [set](http://www.cplusplus.com/reference/set/set/) and [map](http://www.cplusplus.com/reference/map/map/map/). You can define them with any type you want. If you want to use them with your own struct/class, you must implement a comparison function. 
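Before the Python notes below, a tiny Python sketch may make the set/map distinction concrete (the example data is ours): a set only answers membership questions about keys, while a map associates a value with each key.

```python
# a set is a collection of keys: insertions ignore duplicates,
# and membership queries are cheap
seen = set()
for x in [3, 1, 4, 1, 5, 9, 2, 6, 5]:
    seen.add(x)

# a map (a dict in Python) stores key -> value associations;
# here we count how often each letter occurs
freq = {}
for ch in "abracadabra":
    freq[ch] = freq.get(ch, 0) + 1

print(4 in seen, 7 in seen)  # True False
print(freq["a"], freq["r"])  # 5 2
```

Keep in mind that Python's `set`/`dict` are hash-based (average constant-time operations), whereas the C++ `set`/`map` described above are balanced trees with $\mathcal{O}(\log N)$ guarantees and sorted iteration order.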
+
+**Python:** You can use [dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries) for maps and [sets](https://docs.python.org/3/tutorial/datastructures.html#sets) for sets in Python without importing any other libraries.
+
+### Iterators
+
+**C++:** You can use [iterators](http://www.cplusplus.com/reference/iterator/) with every built-in data structure in C++ to point to their elements.
+
+**Python:** You can iterate through any iterable in Python by using **in**. You can check [this](https://wiki.python.org/moin/ForLoop) example.
+
+### Sorting
+
+**C++:** In almost every language, there is a built-in [sort](http://www.cplusplus.com/reference/algorithm/sort/) function, and C++ has one as well. It runs in $\mathcal{O}(N \log N)$ time. You can pass your own comparison function into C++'s sort function.
+
+**Python:** You can use the [sort](https://docs.python.org/3/howto/sorting.html) function on any list or list-like collection in order to sort it, or, if you don't want to change the original collection, you can use the `sorted()` function instead. You can customize the order by passing a key function through the `key` parameter. 
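As a quick illustration of the `key` parameter described above (the sample lists here are ours, not from the lecture):

```python
pairs = [(1, "one"), (3, "three"), (2, "two")]

# sorted() returns a new list; key extracts the value each element is compared by
ordered = sorted(pairs, key=lambda p: p[0], reverse=True)
print(ordered)  # [(3, 'three'), (2, 'two'), (1, 'one')]

# list.sort() sorts in place; Python's sort is stable, so words of equal
# length keep their original relative order
words = ["pear", "fig", "banana", "kiwi"]
words.sort(key=len)
print(words)  # ['fig', 'pear', 'kiwi', 'banana']
```

Note that Python 3 dropped the old `cmp` argument entirely; ordering is always expressed through `key` (with `functools.cmp_to_key` as a bridge for comparator-style functions).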
+ +## Suggested Readings + +### C++ + +- **next_permutation:** [Link](http://www.cplusplus.com/reference/algorithm/next_permutation/) +- **STL document:** [Link](http://www.cplusplus.com/reference/stl/) +- **binary_search:** [Link](http://www.cplusplus.com/reference/algorithm/binary_search/) +- **upper_bound:** [Link](http://www.cplusplus.com/reference/algorithm/upper_bound/) +- **lower_bound:** [Link](http://www.cplusplus.com/reference/algorithm/lower_bound/) +- **reverse:** [Link](http://www.cplusplus.com/reference/algorithm/reverse/) +- **fill:** [Link](http://www.cplusplus.com/reference/algorithm/fill/) +- **count:** [Link](http://www.cplusplus.com/reference/algorithm/count/) + +### Python + +- **bisect:** [Link](https://docs.python.org/3.0/library/bisect.html) +- **collections:** [Link](https://docs.python.org/3/library/collections.html) +- **built-in functions:** [Link](https://docs.python.org/3/library/functions.html) +- **lambda:** [Link](https://www.w3schools.com/python/python_lambda.asp) + +## References + +1. Landau, Edmund (1909). Handbuch der Lehre von der Verteilung der Primzahlen [Handbook on the theory of the distribution of the primes] (in German). Leipzig: B. G. Teubner. p. 31. \ No newline at end of file diff --git a/docs/overrides/partials/content.html b/docs/overrides/partials/content.html new file mode 100644 index 0000000..0472270 --- /dev/null +++ b/docs/overrides/partials/content.html @@ -0,0 +1,24 @@ + +

+ {% if page.meta.title %} + {{ page.meta.title }} + {% else %} + {{ page.title }} + {% endif %} +

+ +
+ + +{{ page.content }} + + +{% if page.meta.editors|length or page.meta.reviewers|length %} +
+ {% if page.meta.editors|length %} +

Editors: {{page.meta.editors|join(", ")}}

+ {% endif %} + {% if page.meta.reviewers|length %} +

Reviewers: {{page.meta.reviewers|join(", ")}}

+ {% endif %} +{% endif %} \ No newline at end of file diff --git a/docs/static/img/favicon.png b/docs/static/img/favicon.png new file mode 100644 index 0000000..15c2cae Binary files /dev/null and b/docs/static/img/favicon.png differ diff --git a/docs/static/img/logo.png b/docs/static/img/logo.png new file mode 100644 index 0000000..c2928d2 Binary files /dev/null and b/docs/static/img/logo.png differ diff --git a/docs/static/javascripts/katex.js b/docs/static/javascripts/katex.js new file mode 100644 index 0000000..a9417bf --- /dev/null +++ b/docs/static/javascripts/katex.js @@ -0,0 +1,10 @@ +document$.subscribe(({ body }) => { + renderMathInElement(body, { + delimiters: [ + { left: "$$", right: "$$", display: true }, + { left: "$", right: "$", display: false }, + { left: "\\(", right: "\\)", display: false }, + { left: "\\[", right: "\\]", display: true } + ], + }) +}) \ No newline at end of file diff --git a/docs/static/stylesheets/main.css b/docs/static/stylesheets/main.css new file mode 100644 index 0000000..029fe9f --- /dev/null +++ b/docs/static/stylesheets/main.css @@ -0,0 +1,4 @@ +.md-sidebar--primary { + display: none; + visibility: hidden; +} diff --git a/guidelines/latex/template/README.md b/guidelines/latex/template/README.md index d692364..e8b158f 100644 --- a/guidelines/latex/template/README.md +++ b/guidelines/latex/template/README.md @@ -1,3 +1,9 @@ # Algorithm-Competition-Programme LaTeX Template -You should download and install **Pygments** in order to work on this template. For any problem or question, you can send a mail to contact@inzva.com \ No newline at end of file +## Quickstart + +To quickly start using the LaTeX template for either creating new bundle or editing the existing ones, you can create a new project in [Overleaf](https://overleaf.com) and import the LaTeX files there. 
+ +Instead of using an online environment, if you want to compile the template locally, you should download and install **Pygments** in order to work on this template. + +For any problem or question, you can send a mail to contact@inzva.com \ No newline at end of file diff --git a/mkdocs.yml b/mkdocs.yml new file mode 100644 index 0000000..c40f875 --- /dev/null +++ b/mkdocs.yml @@ -0,0 +1,90 @@ +site_name: Algorithm Program +site_url: https://inzva.github.io/Algorithm-Program/ +nav: + - Home: index.md + - Introduction: introduction/index.md + - Data Structures: data-structures/index.md + - Algorithms: algorithms/index.md + - Graph: graph/index.md + - Dynamic Programming: dynamic-programming/index.md +theme: + name: material + custom_dir: docs/overrides + favicon: static/img/favicon.png + logo: static/img/logo.png + features: + - toc.follow + - navigation.tabs + - search.suggest + - search.highlight + - content.tabs.link + - content.code.annotation + - content.code.copy + language: en + palette: + - scheme: default + toggle: + icon: material/weather-night + name: Switch to dark mode + primary: black + accent: grey + - scheme: slate + toggle: + icon: material/weather-sunny + name: Switch to light mode + primary: black + accent: white + +extra: + social: + - icon: fontawesome/brands/github-alt + link: https://github.com/inzva + - icon: fontawesome/brands/twitter + link: https://twitter.com/inzvaspace + - icon: fontawesome/brands/instagram + link: https://instagram.com/inzva.space/ + - icon: fontawesome/brands/linkedin + link: https://linkedin.com/company/inzva/ + +extra_javascript: + - static/javascripts/katex.js + - https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.7/katex.min.js + - https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.7/contrib/auto-render.min.js + +extra_css: + - static/stylesheets/main.css + - https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.7/katex.min.css + +markdown_extensions: + - pymdownx.highlight: + linenums: true + anchor_linenums: true + - 
pymdownx.inlinehilite + - pymdownx.snippets + - admonition + - pymdownx.arithmatex: + generic: true + block_tag: "p" + - footnotes + - pymdownx.details + - pymdownx.superfences: + custom_fences: + - name: mermaid + class: mermaid + format: !!python/name:pymdownx.superfences.fence_code_format + - pymdownx.mark + - attr_list + - pymdownx.emoji: + emoji_index: !!python/name:material.extensions.emoji.twemoji + emoji_generator: !!python/name:material.extensions.emoji.to_svg + - toc: + permalink: true + - md_in_html + - tables + +plugins: + - search + - tags + +copyright: | + © 2024 inzva