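---
title: "Optimizing Spark"
source: https://www.jemoka.com/posts/kbhoptimizing_spark/
---

These techniques apply when your domain knowledge can help you make decisions about how Spark load-balances or stripes data across worker nodes.

## Persistence

“You should store this data in faster/slower memory.” The available storage levels are:

- MEMORY_ONLY
- MEMORY_ONLY_SER
- MEMORY_AND_DISK
- MEMORY_AND_DISK_SER
- DISK_ONLY

```python
from pyspark import StorageLevel

# `rdd` is an existing RDD; keep it cached in memory, spilling to disk if needed
rdd.persist(StorageLevel.MEMORY_AND_DISK)
# ... do work ...
rdd.unpersist()  # release the cached partitions once they are no longer needed
```

Parallel Programming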
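To make the faster/slower trade-off between the storage levels under Persistence concrete, here is a minimal sketch of a decision helper. It is not part of the original note: the function name `choose_storage_level` and its three boolean parameters are hypothetical, and the rule of thumb loosely follows the "which storage level to choose" guidance in the Spark programming guide. It returns level names as plain strings, so it runs without a Spark installation.

```python
def choose_storage_level(fits_in_memory: bool,
                         fits_when_serialized: bool,
                         recompute_is_expensive: bool) -> str:
    """Hypothetical helper: pick a storage level name from coarse facts about the data."""
    if fits_in_memory:
        # Fastest option: deserialized objects kept in memory.
        return "MEMORY_ONLY"
    if fits_when_serialized:
        # Trade CPU time (serialization) for a smaller memory footprint.
        return "MEMORY_ONLY_SER"
    if recompute_is_expensive:
        # Spill partitions that do not fit to disk instead of recomputing them.
        return "MEMORY_AND_DISK_SER"
    # Recomputing a dropped partition is assumed to be about as fast as reading it from disk.
    return "MEMORY_ONLY_SER"


print(choose_storage_level(fits_in_memory=False,
                           fits_when_serialized=False,
                           recompute_is_expensive=True))
```

Note that the `_SER` levels belong to the Scala/Java API. In PySpark, stored objects are always pickled, so the corresponding non-`_SER` level would be passed to `rdd.persist(...)` instead.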